Hi,

1.

A small question: How do I ensure memory alignment? I want all malloced memory, globals and functions to be 4-byte aligned. Does llvm have any ".align" keyword?

I'm currently implementing a small scheme toy-compiler, and want to use the lowest 2 bits for type tags. It's currently 380 lines of scheme-code[1], quite similar to the compiler in SICP[2], which I hope to get self-applicable later on.

2.

Can I change the calling conventions / frame handling, so that call frames are allocated on the heap instead of on the stack? Right now all my compiled functions take an environment as an argument to look up variables in the scheme-function. It would perhaps be nicer if I could use the call frames instead, but I can't, since lambdas created in a frame can escape and outlive it once the frame is popped off the stack, for example:

(lambda (x) (lambda (z) (+ x z))) ; the inner lambda is returned.

Regards, Tobias

[1] www.ida.liu.se/~tobnu/compile.ss

Can currently for example compile and run:

(compiler '((lambda (x y z)
              (if (seteq (car (cdr (cons x (cons y 3)))) z)
                  (add 1 0)
                  (sub 2 1)))
            1 2 2))

[2] Structure and Interpretation of Computer Programs, Abelson & Sussman.

Tobias Nurmiranta wrote:
> Hi,

Chris and others can give you better ideas on the ideal ways to implement what you want, but I'll give some ideas/answers for now.

> 1.
>
> A small question: How do I ensure memory alignment? I want all malloced
> memory, globals and functions to be 4-byte aligned. Does llvm have any
> ".align" keyword?

No, LLVM does not currently have an alignment keyword (that I know of). It might be useful to have one, though.

> I'm currently implementing a small scheme toy-compiler, and want to use
> the lowest 2 bits for type tags. It's Currently 380 lines of
> scheme-code[1], quite similar to the compiler in SICP[2], which I hope to
> get self-applicable later on.

I think the only reliable way you could do this would be to implement your own memory allocator function that returns memory from the heap. Your code could ensure that every pointer it returns is on a 4-byte boundary. Any other technique would rely on side effects of the LLVM code generators.

> 2.
>
> Can I change the calling conventions / frame handling, so that call frames
> are allocated on the heap instead of on the stack? Right now all my
> compiled functions take an environment as an argument to lookup variables
> in the scheme-function. It would perhaps be nicer if I could use the call
> frames instead, but I can't since lambdas in it can escape when the
> frame is popped of the stack, for example:

I suppose you could change the LLVM code generators to use whatever calling conventions you want, but I believe your current implementation is the correct way of passing information from a function back to its caller. Changing the code generators to do what you describe essentially "breaks" the LLVM calling convention model (i.e., in LLVM, items on the stack disappear when a function returns; you do not want to assume that something on the function's stack still exists after the function has returned).
Passing a pointer may look a little kludgy, but it is correct and should be fine.

-- John T.

> (lambda (x) (lambda (z) (+ x z))) ; the inner lambda is returned.
>
> Regards, Tobias
>
> [1] www.ida.liu.se/~tobnu/compile.ss
>
> Can currently for example compile and run:
>
> (compiler '((lambda (x y z)
>               (if (seteq (car (cdr (cons x (cons y 3)))) z)
>                   (add 1 0)
>                   (sub 2 1)))
>             1 2 2))
>
> [2] Structure and Interpretation of Computer Programs, Abelson & Sussman.
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev

--
John T. Criswell                    Email: criswell at uiuc.edu
Research Programmer
University of Illinois at Urbana-Champaign

"It's today!" said Piglet. "My favorite day," said Pooh.

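The custom-allocator idea in the reply above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the names `heap_pool`, `heap_used` and `aligned_alloc4`, the fixed pool size, and the bump-pointer strategy (no freeing) are all hypothetical. The point it shows is that rounding every request up to a multiple of 4 keeps each returned pointer on a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and sizes; nothing here comes from LLVM or the thread. */
#define POOL_BYTES 65536u
#define ALIGNMENT  4u

/* Backing storage. The uint32_t member asks the compiler to align the pool
 * for 32-bit access, assumed to be 4 bytes on common platforms; the assert
 * in aligned_alloc4 checks that assumption at run time. */
static union {
    uint32_t force_alignment;
    unsigned char bytes[POOL_BYTES];
} heap_pool;

static size_t heap_used = 0;

/* Bump allocator: returns a pointer whose low 2 bits are zero,
 * or NULL when the pool cannot satisfy the request. Nothing is ever freed. */
void *aligned_alloc4(size_t size)
{
    /* Round the request up to the next multiple of 4 so that the
     * following allocation also starts on a 4-byte boundary. */
    size_t rounded = (size + (ALIGNMENT - 1)) & ~(size_t)(ALIGNMENT - 1);

    /* Reject arithmetic wrap-around and requests larger than the space left. */
    if (rounded < size || rounded > POOL_BYTES - heap_used)
        return NULL;

    void *p = &heap_pool.bytes[heap_used];
    heap_used += rounded;

    assert(((uintptr_t)p & (ALIGNMENT - 1)) == 0);
    return p;
}
```

With this sketch, a 1-byte request followed by a 5-byte request yields two pointers 4 bytes apart, both with their low 2 bits clear. A real allocator for the compiler would presumably also need a growth strategy and, later, cooperation with the garbage collector.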
On Mon, 14 Jun 2004, Tobias Nurmiranta wrote:
> A small question: How do I ensure memory alignment? I want all malloced
> memory, globals and functions to be 4-byte aligned. Does llvm have any
> ".align" keyword?

In the medium term, we plan to add alignment requirements to the alloca/malloc instructions and to globals (vars/functions), but we do not have this yet. Currently the code generator defines the alignment of various structures based on the preferred alignment (usually specified by the platform ABI). If you have a 32-bit object (like an int or a float), you can be pretty certain that the object is going to be four-byte aligned (for global vars, allocas, and mallocs). Functions SHOULD be at least four-byte aligned: if they are not, please file a bug and we'll get it fixed ASAP!

> I'm currently implementing a small scheme toy-compiler, and want to use
> the lowest 2 bits for type tags. It's Currently 380 lines of
> scheme-code[1], quite similar to the compiler in SICP[2], which I hope to
> get self-applicable later on.

Cool! I'm currently out of town so I can't try it out, but I will when I get a chance. This sounds like a neat project!

> Can I change the calling conventions / frame handling, so that call frames
> are allocated on the heap instead of on the stack? Right now all my
> compiled functions take an environment as an argument to lookup variables
> in the scheme-function. It would perhaps be nicer if I could use the call
> frames instead, but I can't since lambdas in it can escape when the
> frame is popped of the stack, for example:

I think that taking an environment pointer is the best way to go. The semantics of LLVM are supposed to match those of a microprocessor, so if you want custom semantics for calls (such as allocating the frame on the heap), they should be implemented explicitly in the LLVM code. If the resulting generated code is not good, please let us know and we can tune the code generator or potentially add a new domain-specific optimization.

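To make the environment-pointer approach concrete, here is a minimal C sketch of how the example `(lambda (x) (lambda (z) (+ x z)))` might be lowered when the environment lives on the heap rather than in the call frame. The struct layout and the names `env`, `closure`, `adder_code`, `make_adder` and `call_closure` are hypothetical, and values are plain `int`s here rather than tagged Scheme values.

```c
#include <assert.h>
#include <stdlib.h>

/* All names below are hypothetical, chosen only for this illustration. */

/* Heap-allocated environment holding the captured variable x. */
struct env {
    int x;
};

/* A closure pairs a code pointer with the environment it was created in. */
struct closure {
    int (*code)(struct env *env, int z);
    struct env *env;
};

/* Body of the inner lambda: (lambda (z) (+ x z)).
 * x is looked up through the explicit environment argument. */
static int adder_code(struct env *env, int z)
{
    return env->x + z;
}

/* Outer lambda: (lambda (x) (lambda (z) (+ x z))).
 * The environment is malloc'ed, so it survives after this frame is popped.
 * Nothing is freed; a garbage collector would reclaim it in a real system. */
struct closure *make_adder(int x)
{
    struct env *env = malloc(sizeof *env);
    struct closure *c = malloc(sizeof *c);
    assert(env != NULL && c != NULL);
    env->x = x;
    c->code = adder_code;
    c->env = env;
    return c;
}

/* Calling a closure passes its own environment as the first argument. */
int call_closure(struct closure *c, int z)
{
    return c->code(c->env, z);
}
```

For example, `call_closure(make_adder(5), 3)` evaluates to 8 even though the frame of `make_adder` has already been popped, because `x` was copied into the heap environment before the function returned.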
One important thing that we don't have (but which will be added when there is interest) is support for explicitly marked tail calls. Currently there is support for tail call *optimizations* (e.g., turning a naive pow into a loop), but no way for a front-end to guarantee that it happens. We will eventually allow the front-end to mark a call as being a tail call, but no one has implemented this yet (it shouldn't be hard). In any case, the optimizer is pretty aggressive about eliminating tail calls, so you probably won't run into problems except in absurd situations.

Writing a scheme front-end for LLVM sounds like a great project: please keep us informed how it goes, and when it gets mostly functional, let us know so we can add a link on the web site. :)

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

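As an illustration of what tail call elimination means for a function like pow, the following C sketch puts a tail-recursive version next to the loop it is equivalent to. It is a hand-written example, not the output of the LLVM optimizer, and the function names and the accumulator parameter are hypothetical.

```c
#include <assert.h>

/* Tail-recursive form: the recursive call is the last thing the function
 * does, so its frame is no longer needed once the call is made. */
long pow_acc(long base, unsigned exp, long acc)
{
    if (exp == 0)
        return acc;
    return pow_acc(base, exp - 1, acc * base);   /* tail call */
}

/* Loop form: the same computation with the tail call replaced by updating
 * the variables and jumping back to the top, so the stack does not grow. */
long pow_loop(long base, unsigned exp)
{
    long acc = 1;
    while (exp != 0) {
        acc *= base;
        exp -= 1;
    }
    return acc;
}

/* Sanity check that both forms agree for a given input. */
void check_same(long base, unsigned exp)
{
    assert(pow_acc(base, exp, 1) == pow_loop(base, exp));
}
```

Both `pow_acc(2, 10, 1)` and `pow_loop(2, 10)` give 1024. Without a guarantee that the first form is compiled like the second, a Scheme program that loops by tail calls could in principle exhaust the stack, which is why a front-end would want to be able to mark such calls explicitly.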
On Mon, 14 Jun 2004, John Criswell wrote:
> > I'm currently implementing a small scheme toy-compiler, and want to use
> > the lowest 2 bits for type tags. It's Currently 380 lines of
> > scheme-code[1], quite similar to the compiler in SICP[2], which I hope to
> > get self-applicable later on.
>
> I think the only reliable way you could do this would be to implement
> your own memory allocator function that returned memory from the heap.
> Your code could ensure that every pointer it returned was on a 4 byte
> boundary.

I just wanted to add that the llvm "malloc" instruction turns into a direct call to the libc malloc implementation. All libc implementations that I know of guarantee at least 4-byte alignment, and many provide 8-byte alignment. If you malloc your data, you should be fine.

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

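Given malloc results that are at least 4-byte aligned, the low 2 bits of every object pointer are zero and can carry a type tag. The C sketch below shows one way this could look. The tag assignment, the `value` type and all helper names are hypothetical, and nothing here is taken from compile.ss.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tagged word: either a fixnum or a pointer plus a 2-bit tag. */
typedef uintptr_t value;

/* Hypothetical tag assignment; only 4 tags fit in 2 bits. */
enum tag { TAG_FIXNUM = 0, TAG_VECTOR = 1, TAG_STRING = 2, TAG_SYMBOL = 3 };

#define TAG_MASK ((uintptr_t)3)

/* Fixnums keep their payload in the upper bits; the low 2 bits hold the tag.
 * The shift is done on the unsigned type to avoid shifting a negative int. */
value make_fixnum(intptr_t n)
{
    return ((uintptr_t)n << 2) | (uintptr_t)TAG_FIXNUM;
}

/* A fixnum word is a multiple of 4, so dividing by 4 recovers the payload
 * exactly, including for negative numbers on two's complement machines. */
intptr_t fixnum_value(value v)
{
    return (intptr_t)v / 4;
}

/* Store a tag in the low 2 bits of an aligned pointer. */
value tag_pointer(void *p, enum tag t)
{
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0);   /* relies on at least 4-byte alignment */
    return bits | (uintptr_t)t;
}

/* Read the tag back out of a tagged word. */
enum tag tag_of(value v)
{
    return (enum tag)(v & TAG_MASK);
}

/* Clear the tag bits to recover the original pointer. */
void *untag_pointer(value v)
{
    return (void *)(v & ~TAG_MASK);
}
```

A cons cell allocated with `malloc` could then be wrapped as `tag_pointer(cell, TAG_VECTOR)` and recovered with `untag_pointer`. The assert documents the dependency on the allocator's alignment, which is the property discussed in this thread.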
Hi, now I've had some free coding time.

On Mon, 14 Jun 2004, Chris Lattner wrote:
> Writing a scheme front-end for LLVM sounds like a great project: please
> keep us informed how it goes, and when it gets mostly functional, let us
> know so we can add a link on the web site. :)
>
> -Chris

Just to keep you informed: my small scheme compiler[1] of 1K lines is now self-applicable, with the types fixnums, symbols, strings, functions and vectors (cons cells are seen as vectors of size 2).

You can for example do:

  cat compile.ss|mzscheme --script compile.ss|llvm-as -o=ccomp.bc
  echo '(display "hello")'|lli ccomp.bc|llvm-as -o=hello2.bc

But be warned, the resulting programs are painfully slow :).

The next step is to implement garbage collection for it, since right now it just joyfully mallocs away :). (It actually runs out of memory if I try to compile compile.ss with ccomp.bc, i.e. "cat compile.ss|lli ccomp.bc".)

A question: would it be difficult to make my compiled compiler (ccomp.bc) call functions in llvm to create basic blocks and instructions, instead of emitting the text format? (See under "LLVM primitives" in the scheme code.) I'll try to read up on how to use the JIT facilities, but I won't say no to any hints :).

Tobias

[1] http://www.ida.liu.se/~tobnu/compile.ss