Hi,

I would like to use LLVM as the backend for a compiler. One of the features I would like to implement is segment-based addressing for position-independent data. To some this may sound strange, to others quite natural, so I will keep the story short.

Imagine you have a custom allocator that manages a 1 GB area of memory. Your application allocates all of its data inside this area, and at the end of the run you save that memory one-to-one to disk. Next time you load the 1 GB block wherever address space is available, and with segment-based addressing all pointers inside it remain valid regardless of where the block was loaded.

On x64 I have the GS/FS registers. Unfortunately their base addresses can be changed only by the OS (not from user space). I am not sure what "tools" are available on ARM; hopefully there is something.

Now my question: what is the best way to tell LLVM to generate the [FS:xxx] and/or [GS:xxxx] class of instructions?

thanks
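The save-and-reload idea described above can be imitated in software without any segment register, by storing offsets instead of addresses and resolving them against an explicit base. The following is only a minimal sketch under that assumption; `Node`, `resolve`, `build_list` and `sum_list` are hypothetical names, not something from the original post or from LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical node type stored inside the arena. `next` is a byte offset
// from the start of the arena, not an absolute address; 0 means "null".
struct Node {
    std::int32_t value;
    std::uint32_t next;
};

// Turn an arena-relative offset into a real pointer for the current mapping.
inline Node* resolve(std::uint8_t* base, std::uint32_t offset) {
    return offset == 0 ? nullptr : reinterpret_cast<Node*>(base + offset);
}

// Build a two-node list inside `arena` and return the offset of its head.
inline std::uint32_t build_list(std::vector<std::uint8_t>& arena) {
    const std::uint32_t head = 8;   // offset 0 is reserved as "null"
    const std::uint32_t tail = 16;
    assert(arena.size() >= tail + sizeof(Node));
    const Node a{1, tail};
    const Node b{2, 0};
    std::memcpy(arena.data() + head, &a, sizeof a);
    std::memcpy(arena.data() + tail, &b, sizeof b);
    return head;
}

// Sum the list starting at `head`, resolving every link against `base`.
inline int sum_list(std::uint8_t* base, std::uint32_t head) {
    int total = 0;
    for (Node* n = resolve(base, head); n != nullptr; n = resolve(base, n->next)) {
        total += n->value;
    }
    return total;
}
```

Copying the arena bytes to a different buffer stands in for "save to disk, load at another address": the same head offset still walks the list, because nothing inside the arena refers to an absolute address. A segment register would only remove the explicit `base` argument from every access.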
Tim Northover
2014-Dec-05 17:28 UTC
[LLVMdev] instruction/intrinsic for segmented adressing
Hi,

> Now on x64 I have GS/FS registers. Pity enough their addresses can be
> changed only by the OS (not in user space). Not sure what "tools" are
> available on ARM, hopefully there is something.

There are no segment registers on ARM. AArch64 has a couple of thread pointer registers that might be abused for the purpose (one even writable from user-space). AArch32 only has one, I believe, which is usually claimed by the OS for threads.

> Now my question is, what is the best way to tell LLVM to generate [FS:xxx]
> and/or [GS:xxxx] class of instructions?

On x86, the addrspace(N) property of pointers triggers use of segment registers (256 => gs, 257 => fs by the looks of it). E.g.

    define i32 @foo(i32 addrspace(256)* %addr) {
      %val = load i32 addrspace(256)* %addr
      ret i32 %val
    }

But as in the ARM case, this is usually the mechanism used for thread-local storage, so be careful.

Cheers.

Tim.
Indirectly related: I just discovered that clang has a non-standard attribute for this, but it is x86-64 specific. It would be interesting to find a cross-platform solution and implement it in both clang and LLVM. The attribute in question is translated to addrspace(256):

http://clang.llvm.org/docs/LanguageExtensions.html#non-standard-c-11-attributes
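For reference, a hedged sketch of how that attribute is usually spelled in source code. It assumes Clang targeting x86 or x86-64, where `__attribute__((address_space(256)))` marks a pointer as GS-relative; on any other compiler or target the macro below expands to nothing, so the function becomes an ordinary load. `GS_RELATIVE` and `load_gs_relative` are illustrative names only.

```cpp
#include <cassert>

// Assumption: Clang on x86 / x86-64. Other compilers or targets fall back to
// an ordinary pointer so that the sketch still compiles everywhere.
#if defined(__clang__) && (defined(__x86_64__) || defined(__i386__))
#define GS_RELATIVE __attribute__((address_space(256)))
#else
#define GS_RELATIVE
#endif

// With Clang on x86 this load is addressed relative to the GS base.
// In the fallback build it is a plain load through a normal pointer.
inline int load_gs_relative(int GS_RELATIVE *p) {
    return *p;
}
```

Calling `load_gs_relative` only makes sense once the GS base actually points at the data region, which is the part the OS controls. The pointer itself has the same size as an ordinary pointer, so the attribute changes how the address is interpreted, not how it is stored.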
Tim Northover
2014-Dec-05 17:55 UTC
[LLVMdev] instruction/intrinsic for segmented adressing
> Not so disappointing the information you are telling me about AArch64. You
> get the point, do not care if they are called segment registers, or
> MyProposeRegisters, important is that I have them there for the duration of
> a fiber. Do you know, are they exhaustively used by linux for instance?

There are two of them (readable from user space): TPIDR_EL0 and TPIDRRO_EL0. The first is exclusively claimed by Linux for TLS, the second is unused (as far as I know) but only writable by the kernel (RO == read-only).

Incidentally, there's no way to directly control either of these in LLVM.

> What about OSX? Anybody has any idea?

OS X uses a different system for TLS at the moment, but almost certainly reserves the right to do whatever it pleases with the segment registers at a future date.

> In terms of CPU time, what would be the overhead of using such
> "segmented"-addressing? Myself I assume almost zero. CPU cache related
> issues would probably not change or?

Probably fairly minimal in most cases (on x86). On ARM there is definitely a cost.

Cheers.

Tim.
Tim Northover
2014-Dec-05 18:32 UTC
[LLVMdev] instruction/intrinsic for segmented adressing
(Adding llvmdev to CC again)

On 5 December 2014 at 10:21, mobi phil <mobi at mobiphil.com> wrote:
>> Probably fairly minimal in most cases (on x86). On ARM there is
>> definitely a cost.
>
> hm... why? You cannot have indexed addressing?

The code that needs to be emitted is roughly:

    [..."segment"-offset into x1...]
    mrs x0, tpidr_el0
    ldr xD, [x0, x1]

That's a more complex addressing mode and an additional MRS instruction over the usual sequence. You also lose the ability to fold the actual address-computation into the LDR.

> and now the obvious question: for aarch64, is there an addrspace(256)
> identical declaration for LLVM?

Nope. That's what I meant by saying there's no direct control over these features from LLVM.

> I am completely lost, where and how to start the transformation. One
> solution would be to modify clang code generation... but that seems to be
> more complex solution and not so general solution.

It's a very difficult problem. The main issue is that the stack won't be in this special address space (at least not without heavy LLVM modifications), so you need a way to distinguish stack accesses from heap. Without source annotation that's reducible to the halting problem. For example:

    int load_address(int *addr) {
      return addr;
    }

    int evil(int *heap_addr) {
      int local_var = 42;
      return load_address(rand() % 2 ? heap_addr : &local_var);
    }

Should the code emitted for load_address use gs or not?

Cheers.

Tim.
Tim Northover
2014-Dec-05 18:41 UTC
[LLVMdev] instruction/intrinsic for segmented adressing
> int load_address(int *addr) {
>   return addr;
> }

Sorry, that should be "return *addr;".

Tim.
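One way to read the "source annotation" point is that the address space has to be part of the pointer's type, so that the choice is made at compile time rather than guessed from data flow. The sketch below illustrates that under stated assumptions: `SegOffset` and `segment_base` are hypothetical stand-ins (the base plays the role gs/fs would play on x86), and the two `load_address` overloads are an illustration, not the code LLVM would emit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper: a byte offset into the relocatable region. The
// pointee type T is kept only so that loads are type-checked.
template <class T>
struct SegOffset {
    std::uint32_t offset;
};

// Stand-in for the segment base (what gs/fs would hold on x86).
inline std::uint8_t*& segment_base() {
    static std::uint8_t* base = nullptr;
    return base;
}

// Ordinary pointer (stack, globals): a plain load.
inline int load_address(const int* addr) {
    return *addr;
}

// Segment-relative value: the load goes through the base.
inline int load_address(SegOffset<int> addr) {
    return *reinterpret_cast<const int*>(segment_base() + addr.offset);
}
```

With distinct types, the conditional expression in `evil` would no longer compile, because `SegOffset<int>` and `int*` have no common type; the ambiguity turns into a compile-time error instead of a question the code generator cannot answer.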
Thanks again for your help!

> >> Probably fairly minimal in most cases (on x86). On ARM there is
> >> definitely a cost.
> >
> > hm... why? You cannot have indexed addressing?
>
> What I need is a way to force
> The code that needs to be emitted is roughly:
>     [..."segment"-offset into x1...]
>     mrs x0, tpidr_el0
>     ldr xD, [x0, x1]
>
> That's a more complex addressing mode and an additional MRS
> instruction over the usual sequence. You also lose the ability to fold
> the actual address-computation into the LDR.

But this is the price you always pay for RISC vs. x86, isn't it? It is probably difficult to quantify, but I wonder whether it would add more than a 5% slowdown to an average program, especially a long-running server-class application.

> > and now the obvious question: for aarch64, is there an addrspace(256)
> > identical declaration for LLVM?
>
> Nope. That's what I meant by saying there's no direct control over
> these features from LLVM.

Wouldn't it make sense to add such an addressing instruction at the LLVM IR level? Have there been no similar requests? I do not know if there is any interest, but this would help to implement a lot of things, such as pointer size compression on 64-bit (pointers would be kept as 32-bit), easier data sharing between processes (mmap with segmented addressing), and position-independent data (load and save chunks of data containing pointers while keeping pointer semantics valid).

Knowing this, it means that my compiler has to generate platform-dependent assembler code inside the IR, which means I would not be able to run such code inside the LLVM virtual machine. Another solution for my problem would be to carry around the segment address as an extra function parameter to all functions, but that would be a funny

> It's a very difficult problem. The main issue is that the stack won't
> be in this special address space (at least not without heavy LLVM
> modifications), so you need a way to distinguish stack accesses from
> heap. Without source annotation that's reducible to the halting
> problem. For example:
>
> int load_address(int *addr) {
>   return addr;
> }
>
> int evil(int *heap_addr) {
>   int local_var = 42;
>   return load_address(rand() % 2 ? heap_addr : &local_var);
> }
>
> Should the code emitted for load_address use gs or not?

The stack should not be in this address space, and this addressing should not apply to the stack. The framework would make every C++ constructor private (accessible only to some friend factory methods), so such objects could not be created on the stack, only on the heap. So I wonder whether it is possible, in an LLVM pass, to trace back all pointers in the IR that were initialized by a certain function (the factory function) and change the addressing.

I tried to play with a naive approach:

    #include <stdint.h>

    uint8_t *global_segment;
    #define ainline __attribute__((always_inline))

    template<class A> class CompactPointer {
      uint32_t adr;  // byte offset from global_segment
    public:
      // global_segment is already a byte pointer, so add the offset directly
      ainline A *operator->() { return reinterpret_cast<A*>(global_segment + adr); }
    };

    // Object and OtherObject were not shown in the original post;
    // minimal placeholder definitions so that the snippet compiles.
    struct Object {};
    struct OtherObject { CompactPointer<Object> cpo; };

    int main() {
      CompactPointer<OtherObject> cpoo;
      CompactPointer<Object> cp = cpoo->cpo;
    }

All such dereferencing statements would have references to global_segment in the IR. I could trace those back (with a custom LLVM pass) and translate them to a "segmented instruction". Having seen that addrspace(256) is specific to x86, I could generate custom addressing code for both x86-64 and AArch64. The problem is that by doing so I would probably lose the chance for some code to be optimized in other phases; if I applied such a pass at a later stage, I might not be able to find the patterns.

On x86-64, unless I call some library functions, I have the guarantee that nobody will change the values in the gs/fs registers. Is there a way to tell LLVM not to reserve a certain register?

I wish more attention would be given to such a design pattern across all languages and platforms.
Sorry if there is some confusion in what I write; I do not know LLVM and the platforms themselves well yet, and I would like to know what my possibilities are before starting to read hundreds of pages of documentation.

Thanks in advance for the answers,

rgrds,
mobi phil

being mobile, but including technology
http://mobiphil.com
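As a rough illustration of the pointer-compression use case mentioned in this message, the sketch below keeps the base in a thread_local variable instead of a hardware register, so that each thread (or fiber pinned to a thread) can use its own region. This is an assumption made for portability, not what the poster proposed; `region_base`, `CompactPtr`, `CompactNode` and `RawNode` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one relocatable region per thread; its base lives in a
// thread_local variable rather than in a segment register.
inline std::uint8_t*& region_base() {
    thread_local std::uint8_t* base = nullptr;
    return base;
}

// A 32-bit "pointer": a byte offset from the current thread's region base.
template <class T>
class CompactPtr {
    std::uint32_t offset_;
public:
    explicit CompactPtr(std::uint32_t offset = 0) : offset_(offset) {}
    T* get() const { return reinterpret_cast<T*>(region_base() + offset_); }
    T* operator->() const { return get(); }
    T& operator*() const { return *get(); }
};

// The same list node written with a compressed link and with a raw pointer.
struct CompactNode {
    std::int32_t value;
    CompactPtr<CompactNode> next;
};

struct RawNode {
    std::int32_t value;
    RawNode* next;
};
```

On a typical 64-bit target the compact node is smaller than the raw one, since its link takes 4 bytes instead of 8. The cost is the extra base lookup and addition on every dereference, which is the overhead discussed earlier in the thread.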