Matt Godbolt via llvm-dev
2016-Mar-30 12:53 UTC
[llvm-dev] JIT compiler and calls to existing functions
For what it's worth, we did a similar thing, but overrode RTDyldMemoryManager directly. This allowed us to control where the RAM was allocated too (e.g. guarantee it was in the low 4GB so we could use the small memory model and avoid the "mov rax, xxxxxxx; call rax" code generated for x86-64)*, and also override findSymbol() to have the same behaviour as described in 4).

--matt

* Later issues in not always being able to allocate low RAM caused us to drop this feature. I wish we could fix the "call rax"es :)

On Tue, Mar 29, 2016 at 6:51 PM Philip Reames via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> There is no documentation I know of. Rough sketch:
> 1) Create a subclass of SectionMemoryManager.
> 2) Create an instance of this class and add it to the EngineBuilder via setMCJITMemoryManager.
>    (Make sure everything runs without changes.)
> 3) Override the getSymbolAddress method; have your implementation call the base class's impl.
>    (Make sure everything runs without changes.)
> 4) Add handling to map name -> address for any custom functions you want.
>
> On 03/28/2016 09:18 PM, Russell Wallace wrote:
>> True, I care more about how fast the code runs than how long it takes to compile it. So if the symbolic approach enables better code generation, that is a very significant advantage from my perspective.
>>
>> Is there any example code or documentation you can point to for details about how to implement the symbolic approach? Is it similar to any of the versions of Kaleidoscope or any other extant tutorial?
>>
>> On Tue, Mar 29, 2016 at 5:07 AM, Philip Reames <listmail at philipreames.com> wrote:
>>> Other advantages of keeping things symbolic:
>>> 1) You can use function attributes to provide optimization or semantic information.
>>> 2) Linking modules works as expected when one of them contains the definition.
>>> 3) You can get better code generation (i.e. pc-relative addressing for local symbols, etc.).
>>>
>>> If the inttoptr scheme makes you happy, go for it. I'm not telling you it's wrong, merely that there's another approach you should consider which has its own advantages.
>>>
>>> Philip
>>>
>>> p.s. Your point about compiling faster is off base. Not because you're wrong, but because if you're trying to optimize for compile speed at that granularity, you really don't want to be using LLVM (or just about any framework). LLVM does not make a good first-tier JIT. It makes a great high-tier JIT, but if you really care about compile time, it is not the appropriate answer.
>>>
>>> On 03/28/2016 08:57 PM, Russell Wallace wrote:
>>>> Ah! Okay then, so you are saying something substantive that I think I disagree with, but that could be because there are relevant issues I don't understand.
>>>>
>>>> My reasoning is: I've already got a pointer to the function I want the generated code to call, so I just supply that pointer. It looks ugly on a microscopic scale because there are a couple of lines of casts to shepherd it through the type system, but on an even slightly larger scale it's as simple and elegant as it gets: I supply a 64-bit machine word that gets compiled into the object code. There are no extra layers of machinery to go wrong, no nonlocal effects to understand, no having to wonder whether anything depends on symbol information that might vary between debug and release builds, or between the vendor compiler and clang, or between Windows and Linux; if it works once, it will work the same way every time. As a bonus, it will compile slightly faster.
>>>>
>>>> Keeping things symbolic - you're saying the advantage is that the intermediate code is easier to debug because a dump will say 'call print' instead of 'call function at 123456'? I can see that being a real advantage, but as of right now it doesn't jump out at me as necessarily outweighing the advantages of the direct pointer method.
>>>>
>>>> On Tue, Mar 29, 2016 at 4:37 AM, Philip Reames <listmail at philipreames.com> wrote:
>>>>> I think our use cases are actually quite similar. Part of generating the in-memory executable code is resolving all the symbols and relocations. The details of this are mostly hidden from you by the MCJIT interface, but it's this step I was referring to as "link time".
>>>>>
>>>>> The way to think of MCJIT: generate an object file, incrementally link, run the dynamic loader - but do it all in memory, without round-tripping through disk or explicit files.
>>>>>
>>>>> Philip
>>>>>
>>>>> On Mar 28, 2016, at 7:25 PM, Russell Wallace <russell.wallace at gmail.com> wrote:
>>>>>> Right, but when you say link time - the JIT compiler I'm writing works the way OpenJDK or V8 do: it reads a script, JIT compiles it into memory, and runs the code in memory without ever writing anything to disk (an option for ahead-of-time compilation may come later, but that will be a while down the road), so we might be doing different things?
>>>>>>
>>>>>> On Tue, Mar 29, 2016 at 2:59 AM, Philip Reames <listmail at philipreames.com> wrote:
>>>>>>> The option we use is to have a custom memory manager, override the getPointerToNamedFunction function, and provide the pointer to the external function at link time. The inttoptr scheme works fairly well, but it does make for some pretty ugly and sometimes hard-to-analyze IR. I recommend leaving everything symbolic until link time if you can.
>>>>>>>
>>>>>>> Philip
>>>>>>>
>>>>>>> On 03/28/2016 06:33 PM, Russell Wallace via llvm-dev wrote:
>>>>>>>> That seems to work, thanks! The specific code I ended up with to call int64_t print(int64_t) looks like:
>>>>>>>>
>>>>>>>>   auto f = builder.CreateIntToPtr(
>>>>>>>>       ConstantInt::get(builder.getInt64Ty(), uintptr_t(print)),
>>>>>>>>       PointerType::getUnqual(FunctionType::get(
>>>>>>>>           builder.getInt64Ty(), {builder.getInt64Ty()}, false)));
>>>>>>>>   return builder.CreateCall(f, args);
>>>>>>>>
>>>>>>>> On Mon, Mar 28, 2016 at 1:40 PM, Caldarale, Charles R <Chuck.Caldarale at unisys.com> wrote:
>>>>>>>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org]
>>>>>>>>> > On Behalf Of Russell Wallace via llvm-dev
>>>>>>>>> > Subject: [llvm-dev] JIT compiler and calls to existing functions
>>>>>>>>> >
>>>>>>>>> > In the context of a JIT compiler, what's the recommended way to generate a call to an existing function - that is, not one that you are generating on the fly with LLVM, but one that's already linked into your program? For example the cosine function (from the standard math library); the Kaleidoscope tutorial recommends looking it up by name with dlsym("cos"), but it seems to me that it should be possible to use a more efficient and portable solution that takes advantage of the fact that you already have an actual pointer to cos, even if you haven't linked with debugging symbols.
>>>>>>>>>
>>>>>>>>> Perhaps not the most elegant, but we simply use the IRBuilder.CreateIntToPtr() method to construct the Callee argument for IRBuilder.CreateCall(). The first argument for CreateIntToPtr() comes from ConstantInt::get(I64, uintptr_t(ptr)), while the second is a function-type pointer defined by using PointerType::get() on the result of FunctionType::get() with the appropriate function signature.
>>>>>>>>>
>>>>>>>>> - Chuck

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Russell Wallace via llvm-dev
2016-Mar-30 14:46 UTC
[llvm-dev] JIT compiler and calls to existing functions
Can't the code generator do this opportunistically? That is, generate the more compact instruction sequence if the address happens to be within four gigabytes, otherwise generate the longer form?

On Wed, Mar 30, 2016 at 1:53 PM, Matt Godbolt <matt at godbolt.org> wrote:

> For what it's worth, we did a similar thing, but overrode RTDyldMemoryManager directly. This allowed us to control where the RAM was allocated too (e.g. guarantee it was in the low 4GB so we could use the small memory model and avoid the "mov rax, xxxxxxx; call rax" code generated for x86-64)*, and also override findSymbol() to have the same behaviour as described in 4).
>
> --matt
>
> * Later issues in not always being able to allocate low RAM caused us to drop this feature. I wish we could fix the "call rax"es :)
Matt Godbolt via llvm-dev
2016-Mar-30 14:47 UTC
[llvm-dev] JIT compiler and calls to existing functions
That would be ideal, but the code generator needs to know the relative distance from the call site to the called function, which is only known at link time.

On Wed, Mar 30, 2016 at 9:46 AM Russell Wallace <russell.wallace at gmail.com> wrote:

> Can't the code generator do this opportunistically? That is, generate the more compact instruction sequence if the address happens to be within four gigabytes, otherwise generate the longer form?
Philip Reames via llvm-dev
2016-Mar-30 15:08 UTC
[llvm-dev] JIT compiler and calls to existing functions
We use an explicit relocation step to deal with this. We generate code into a temporary memory location, then relocate it into a reserved area of memory which is always within a small relative offset of other interesting code. This allows us to get pc-relative calls.

Philip

On 03/30/2016 05:53 AM, Matt Godbolt wrote:

> For what it's worth, we did a similar thing, but overrode RTDyldMemoryManager directly. This allowed us to control where the RAM was allocated too (e.g. guarantee it was in the low 4GB so we could use the small memory model and avoid the "mov rax, xxxxxxx; call rax" code generated for x86-64)*, and also override findSymbol() to have the same behaviour as described in 4).
>
> --matt
>
> * Later issues in not always being able to allocate low RAM caused us to drop this feature. I wish we could fix the "call rax"es :)
Lang Hames via llvm-dev
2016-Apr-04 18:05 UTC
[llvm-dev] JIT compiler and calls to existing functions
Hi All,

For reasons already covered by Philip and others I prefer symbolic
resolution, rather than baking pointers into the IR. If you go that route
and you're using ORC you just need to add a line to your symbol resolver:

auto Resolver = createLambdaResolver(
    [&](const std::string &Name) {
      if (auto Sym = findMangledSymbol(Name))
        return RuntimeDyld::SymbolInfo(Sym.getAddress(), Sym.getFlags());
      if (Name == "sin")
        return RuntimeDyld::SymbolInfo(
            static_cast<uint64_t>(reinterpret_cast<uintptr_t>(&sin)),
            JITSymbolFlags::Exported);
      return RuntimeDyld::SymbolInfo(nullptr);
    },
    [](const std::string &S) { return nullptr; });

If you have a large number of in-process symbols you'd like to supply,
you can substitute a StringMap lookup. See
llvm/examples/Kaleidoscope/include/KaleidoscopeJIT.h.

Cheers,
Lang.

On Wed, Mar 30, 2016 at 8:08 AM, Philip Reames via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> We use an explicit relocation step to deal with this. We generate code
> into a temporary memory location, then relocate it into a reserved area
> of memory which is always within a small relative offset of other
> interesting code. This allows us to get pc-relative calls.
>
> Philip
>
> On 03/30/2016 05:53 AM, Matt Godbolt wrote:
>
> For what it's worth we did a similar thing, but overrode
> RTDyldMemoryManager directly. This allowed us to control where the RAM
> was allocated too (e.g. guarantee it was in the low 4GB so we could use
> the small memory model and avoid the "mov rax, xxxxxxx; call rax" code
> generated for x86)*, and also override findSymbol() to have the same
> behaviour as described in 4).
>
> --matt
>
> * Later issues in not always being able to allocate low RAM caused us
> to drop this feature. I wish we could fix the "call rax"es :)
>
> On Tue, Mar 29, 2016 at 6:51 PM Philip Reames via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> There is no documentation I know of.
Rough sketch:
>> 1) Create a subclass of SectionMemoryManager.
>> 2) Create an instance of this class and add it to the EngineBuilder via
>> setMCJITMemoryManager.
>> (Make sure everything runs without changes.)
>> 3) Override the getSymbolAddress method; have your implementation call
>> the base class's impl.
>> (Make sure everything runs without changes.)
>> 4) Add handling to map name->address for any custom functions you want.
>>
>> On 03/28/2016 09:18 PM, Russell Wallace wrote:
>>
>> True, I care more about how fast the code runs than how long it takes
>> to compile it. So if the symbolic approach enables better code
>> generation, that is a very significant advantage from my perspective.
>>
>> Is there any example code or documentation you can point to for details
>> about how to implement the symbolic approach? Is it similar to any of
>> the versions of Kaleidoscope or any other extant tutorial?
>>
>> On Tue, Mar 29, 2016 at 5:07 AM, Philip Reames <
>> listmail at philipreames.com> wrote:
>>
>>> Other advantages of keeping things symbolic:
>>> 1) You can use function attributes to provide optimization or semantic
>>> information.
>>> 2) Linking modules works as expected when one of them contains the
>>> definition.
>>> 3) You can get better code generation (e.g. pc-relative addressing for
>>> local symbols).
>>>
>>> If the inttoptr scheme makes you happy, go for it. I'm not telling you
>>> it's wrong, merely that there's another approach you should consider
>>> which has its own advantages.
>>>
>>> Philip
>>>
>>> p.s. Your point about compiling faster is off base. Not because you're
>>> wrong, but because if you're trying to optimize for compile speed at
>>> that granularity, you really don't want to be using LLVM (or just
>>> about any framework). LLVM does not make a good first-tier JIT. It
>>> makes a great high-tier JIT, but if you really care about compile
>>> time, it is not the appropriate answer.
>>>
>>> On 03/28/2016 08:57 PM, Russell Wallace wrote:
>>>
>>> Ah! Okay then, so you are saying something substantive that I think I
>>> disagree with, but that could be because there are relevant issues I
>>> don't understand.
>>>
>>> My reasoning is: I've already got a pointer to the function I want the
>>> generated code to call, so I just supply that pointer. It looks ugly
>>> on a microscopic scale because there are a couple of lines of casts to
>>> shepherd it through the type system, but on even a slightly larger
>>> scale it's as simple and elegant as it gets: I supply a 64-bit machine
>>> word that gets compiled into the object code. There are no extra
>>> layers of machinery to go wrong, no nonlocal effects to understand,
>>> and no wondering whether anything depends on symbol information that
>>> might vary between debug and release builds, between the vendor
>>> compiler and clang, or between Windows and Linux; if it works once, it
>>> will work the same way every time. As a bonus, it will compile
>>> slightly faster.
>>>
>>> Keeping things symbolic: you're saying the advantage is that the
>>> intermediate code is easier to debug because a dump will say 'call
>>> print' instead of 'call function at 123456'? I can see that being a
>>> real advantage, but as of right now it doesn't jump out at me as
>>> necessarily outweighing the advantages of the direct-pointer method.
>>>
>>> On Tue, Mar 29, 2016 at 4:37 AM, Philip Reames <
>>> listmail at philipreames.com> wrote:
>>>
>>>> I think our use cases are actually quite similar. Part of generating
>>>> the in-memory executable code is resolving all the symbolic
>>>> references and relocations. The details of this are mostly hidden
>>>> from you by the MCJIT interface, but it's this step I was referring
>>>> to as "link time".
>>>>
>>>> The way to think of MCJIT: generate an object file, incrementally
>>>> link it, and run the dynamic loader, but do it all in memory without
>>>> round-tripping through disk or explicit files.
>>>>
>>>> Philip
>>>>
>>>> On Mar 28, 2016, at 7:25 PM, Russell Wallace <
>>>> russell.wallace at gmail.com> wrote:
>>>>
>>>> Right, but when you say link time: the JIT compiler I'm writing works
>>>> the way OpenJDK or V8 does. It reads a script, JIT-compiles it into
>>>> memory, and runs the code in memory without ever writing anything to
>>>> disk (an option for ahead-of-time compilation may come later, but
>>>> that will be a while down the road), so we might be doing different
>>>> things?
>>>>
>>>> On Tue, Mar 29, 2016 at 2:59 AM, Philip Reames <
>>>> listmail at philipreames.com> wrote:
>>>>
>>>>> The option we use is to have a custom memory manager, override the
>>>>> getPointerToNamedFunction function, and provide the pointer to the
>>>>> external function at link time. The inttoptr scheme works fairly
>>>>> well, but it does make for some pretty ugly and sometimes
>>>>> hard-to-analyze IR. I recommend leaving everything symbolic until
>>>>> link time if you can.
>>>>>
>>>>> Philip
>>>>>
>>>>> On 03/28/2016 06:33 PM, Russell Wallace via llvm-dev wrote:
>>>>>
>>>>> That seems to work, thanks!
The specific code I ended up with to call
>>>>> int64_t print(int64_t) looks like:
>>>>>
>>>>> auto f = builder.CreateIntToPtr(
>>>>>     ConstantInt::get(builder.getInt64Ty(), uintptr_t(print)),
>>>>>     PointerType::getUnqual(FunctionType::get(
>>>>>         builder.getInt64Ty(), {builder.getInt64Ty()}, false)));
>>>>> return builder.CreateCall(f, args);
>>>>>
>>>>> On Mon, Mar 28, 2016 at 1:40 PM, Caldarale, Charles R <
>>>>> Chuck.Caldarale at unisys.com> wrote:
>>>>>
>>>>>> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org]
>>>>>> > On Behalf Of Russell Wallace via llvm-dev
>>>>>> > Subject: [llvm-dev] JIT compiler and calls to existing functions
>>>>>>
>>>>>> > In the context of a JIT compiler, what's the recommended way to
>>>>>> > generate a call to an existing function, that is, not one that
>>>>>> > you are generating on the fly with LLVM, but one that's already
>>>>>> > linked into your program? For example the cosine function (from
>>>>>> > the standard math library); the Kaleidoscope tutorial recommends
>>>>>> > looking it up by name with dlsym("cos"), but it seems to me that
>>>>>> > it should be possible to use a more efficient and portable
>>>>>> > solution that takes advantage of the fact that you already have
>>>>>> > an actual pointer to cos, even if you haven't linked with
>>>>>> > debugging symbols.
>>>>>>
>>>>>> Perhaps not the most elegant, but we simply use the
>>>>>> IRBuilder.CreateIntToPtr() method to construct the Callee argument
>>>>>> for IRBuilder.CreateCall(). The first argument for CreateIntToPtr()
>>>>>> comes from ConstantInt::get(I64, uintptr_t(ptr)), while the second
>>>>>> is a function type pointer defined by using PointerType::get() on
>>>>>> the result of FunctionType::get() with the appropriate function
>>>>>> signature.
>>>>>>
>>>>>> - Chuck
>>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev