I'm losing my sanity, so I thought I'd try and generate an LLVM target for the Glasgow Haskell Compiler (GHC). In talking to some of the people in the GHC mailing list some issues have come up that I can't find a ready answer to. (Others came up that I could, so I don't feel quite as stupid or helpless as I could.) 1. Is there any way to hint that a global pointer variable should be put in a register when further translated to some native form of code? This is specifically necessary, apparently, for the current stack and heap pointers in GHC for speed reasons. 2. This I'm just going to quote: "With (almost) every chunk of code, we want to associate a smal chunk of data (the "info table") which contains information used by the garbage collector and other parts of the run time system. We want to use only one pointer to point to both of those things, and we don't want to waste time with any additional indirections, so we make the pointer point between the data chunk and the code chunk." I've asked for clarification on this from the original author, but I suspect it's something like having a pointer to code base that uses negative offsets to associated data (like some C++ implementations use for the vtable). 3. What is the actual support for tail calls? The docs are pretty vague on this point -- the "tail" keyword seems to "enable" tail call optimisation. I'm not sure what that means. 4. GHC relies an awful lot on having many, many, many "little stacks". Is it possible to have lots of tiny stacks in LLVM's architecture, to switch between them easily, to check when they overflow and to automatically grow them when they exceed their bounds? -- Michael T. Richter Email: ttmrichter at gmail.com, mtr1966 at hotpop.com MSN: ttmrichter at hotmail.com, mtr1966 at hotmail.com; YIM: michael_richter_1966; AIM: YanJiahua1966; ICQ: 241960658; Jabber: mtr1966 at jabber.cn "To [the Chinese], all other people are barbarians." --The Dalai Lama -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20061221/864af72c/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 191 bytes Desc: This is a digitally signed message part URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20061221/864af72c/attachment.sig>
Hi, Michael T. Richter wrote:> 4. GHC relies an awful lot on having many, many, many "little stacks". > Is it possible to have lots of tiny stacks in LLVM's architecture, to > switch between them easily, to check when they overflow and to > automatically grow them when they exceed their bounds?I was also wondering about this from the point of view of using channels to communicate between tasks, where only one task runs at a time and it runs until it blocks. (Yes, I'm being influenced by CSP and Bell Labs' Squeak, Newsqueak, Alef, Limbo, etc.) Cheers, Ralph.
Hello Michael, On Thu, 2006-12-21 at 20:23 +0800, Michael T. Richter wrote:> I'm losing my sanity, so I thought I'd try and generate an LLVM target > for the Glasgow Haskell Compiler (GHC). In talking to some of the > people in the GHC mailing list some issues have come up that I can't > find a ready answer to. (Others came up that I could, so I don't feel > quite as stupid or helpless as I could.) > > 1. Is there any way to hint that a global pointer variable should > be put in a register when further translated to some native > form of code? This is specifically necessary, apparently, for > the current stack and heap pointers in GHC for speed reasons.I don't think so.> 1. This I'm just going to quote: "With (almost) every chunk of > code, we want to associate a smal chunk of data (the "info > table") which contains information used by the garbage > collector and other parts of the run time system. We want to > use only one pointer to point to both of those things, and we > don't want to waste time with any additional indirections, so > we make the pointer point between the data chunk and the code > chunk." I've asked for clarification on this from the > original author, but I suspect it's something like having a > pointer to code base that uses negative offsets to associated > data (like some C++ implementations use for the vtable).Okay. I'm not sure what the question is. Certainly LLVM can handle vtable type things. If this is only for garbage collection, you might want to consider implementing the garbage collection interface in LLVM. Please see: http://llvm.org/docs/GarbageCollection.html> 1. What is the actual support for tail calls? The docs are > pretty vague on this point -- the "tail" keyword seems to > "enable" tail call optimisation. I'm not sure what that > means.Function call instructions can have a "tail" option which enables tail call optimization. If you write: tail call void %foo () then you will get a tail call. Without the "tail" keyword, you won't.> 1. GHC relies an awful lot on having many, many, many "little > stacks". Is it possible to have lots of tiny stacks in LLVM's > architecture, to switch between them easily, to check when > they overflow and to automatically grow them when they exceed > their bounds?I don't think so, but things could be done in the code generator to make that happen. I believe the code generators assume there's one execution stack. However, I'm not the best person to answer this question so perhaps someone else will chime in. Reid.> > -- > Michael T. Richter > Email: ttmrichter at gmail.com, mtr1966 at hotpop.com > MSN: ttmrichter at hotmail.com, mtr1966 at hotmail.com; YIM: > michael_richter_1966; AIM: YanJiahua1966; ICQ: 241960658; Jabber: > mtr1966 at jabber.cn > > "To [the Chinese], all other people are barbarians." --The Dalai Lama > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Thu, 21 Dec 2006, Michael T. Richter wrote:> I'm losing my sanity, so I thought I'd try and generate an LLVM target > for the Glasgow Haskell Compiler (GHC). In talking to some of theAh, so you had to lose your sanity before considering llvm... I see. :) :)> people in the GHC mailing list some issues have come up that I can't > find a ready answer to. (Others came up that I could, so I don't feel > quite as stupid or helpless as I could.)Ok,> 1. Is there any way to hint that a global pointer variable should > be put in a register when further translated to some native form > of code? This is specifically necessary, apparently, for the > current stack and heap pointers in GHC for speed reasons.llvm-gcc supports the GCC 'asm("%ebx")' syntax for global variables. Note that, like GCC, LLVM only makes guarantees about the contents of the register during inline asm calls, so this may not be sufficient for you. If this isn't sufficient, it may make sense to do a simple extension to LLVM to provide a capability to do this sort of thing. However, you should do some performance analysis to determine if this is needed, LLVM produces very different code than GCC, so you may not need to do this.> 2. This I'm just going to quote: "With (almost) every chunk of > code, we want to associate a smal chunk of data (the "info > table") which contains information used by the garbage collector > and other parts of the run time system. We want to use only one > pointer to point to both of those things, and we don't want to > waste time with any additional indirections, so we make the > pointer point between the data chunk and the code chunk." I've > asked for clarification on this from the original author, but I > suspect it's something like having a pointer to code base that > uses negative offsets to associated data (like some C++ > implementations use for the vtable).C++ v-tables aren't typically implemented this way, AFAIK. However, it should be easy to add an intrinsic to capture this and do the right thing.> 3. What is the actual support for tail calls? The docs are pretty > vague on this point -- the "tail" keyword seems to "enable" tail > call optimisation. I'm not sure what that means.When a call is marked 'tail', the compiler is allowed to assume that the callee does not access the caller's stack. Thus, if marked 'tail', in the tail position (right before a return), and marked 'fastcc' (so arguments can be passed normal order instead of reverse order), the code generator can deallocate the stack before calling the callee. AFAIK, the X86 backend is currently the only one that supports this, and you have to pass a special option to get it to do so. It should be straight-forward to extend other targets to support this, but there has been little demand so far.> 4. GHC relies an awful lot on having many, many, many "little > stacks". Is it possible to have lots of tiny stacks in LLVM's > architecture, to switch between them easily, to check when they > overflow and to automatically grow them when they exceed their > bounds?It depends on what you mean. There are multiple ways to achieve this. The simplest way is to use CPS: http://nondot.org/sabre/LLVMNotes/ExplicitlyManagedStackFrames.txt -Chris -- http://nondot.org/sabre/ http://llvm.org/