On 01/13/2010 04:08 AM, Garrison Venn wrote:

> If it helps, to see what is involved, outside of a pure IR context,
> see the example code, and doc at:
>
> http://wiki.llvm.org/HowTo:_Build_JIT_based_Exception_mechanism#Source_Code:_exceptionDemo.cpp

It does, although in the "let me show you why this is too much to tackle" way.

> Although this is a pure example that shows several test cases,
> including foreign exception interaction, it is not an IR example, but
> rather a LLVM IR API example. It would be interesting to see a pure
> IR version of a personality function. I don't see why this would not
> be possible, although costly in terms of effort. Clang would help.

Beyond the scope of the project, I guess. Sounds too far out on the diminishing returns curve for knowledge. If I spend too much time handcoding IR the first extension to the project would be to write a "high-level IR" front-end that provides a 1-1 mapping of the semantics, but with handcoding-friendly syntax and tools. It would actually save time at some level, given that I'm manually #including headers just to reduce the amount of code duplication to saying it twice instead of many times.

Complete aside--I hate when people tell me something is impossible, even me. :-) So after I said you couldn't do without CPP-style #includes a few days ago I was annoyed enough to design in my head an import/export mechanism using only unix tools everyone has lying around. Just to prove myself wrong, I guess. I'm not sure I'll implement it given that I already have a lot of code written the other way, but LLVM syntax is simple enough that it could be done without parsing the IR.
I don't know if I have enough IR left to justify switching over, but it would be satisfying in principle to get rid of the duplication of headers.

> There are also ways to lower your invoke/unwind into a
> setjmp/longjmp implementation, but I do not know how to do this in
> IR, as it requires function pass setup which is outside the scope of
> IR.

I don't know enough about how setjmp/longjmp are implemented to have a clue. If I'm getting into uncharted territory it's easier to just unwind the evaluator stack by hand, just as I already did with the parser when unwinding didn't work. The focus is on learning IR and about the simple lisp evaluation model.

There are actually limits to my madness, you know. :-) It would be more profitable to learn another aspect of the system by implementing a MMIX back-end or something.

Or, and I know this is just *crazy* talk, I could actually follow the intended learning path and use the main C++ API for something. :-)

Dustin

On Jan 13, 2010, at 12:46, Dustin Laurence wrote:

> On 01/13/2010 04:08 AM, Garrison Venn wrote:
>
>> If it helps, to see what is involved, outside of a pure IR context,
>> see the example code, and doc at:
>>
>> http://wiki.llvm.org/HowTo:_Build_JIT_based_Exception_mechanism#Source_Code:_exceptionDemo.cpp
>
> It does, although in the "let me show you why this is too much to
> tackle" way.
>

Yeah, I hear you. The LLVM developer fly trap got me. ;-)

>> Although this is a pure example that shows several test cases,
>> including foreign exception interaction, it is not an IR example, but
>> rather a LLVM IR API example. It would be interesting to see a pure
>> IR version of a personality function. I don't see why this would not
>> be possible, although costly in terms of effort. Clang would help.

> snip

>> There are also ways to lower your invoke/unwind into a
>> setjmp/longjmp implementation, but I do not know how to do this in
>> IR, as it requires function pass setup which is outside the scope of
>> IR.
>
> I don't know enough about how setjmp/longjmp are implemented to have a
> clue. If I'm getting into uncharted territory it's easier to just
> unwind the evaluator stack by hand, just as I already did with the
> parser when unwinding didn't work. The focus is on learning IR and
> about the simple lisp evaluation model.
>

For pedagogical purposes, the lowering is accomplished by an IR to IR graph transformation that you add to a function pass manager. I personally view LLVM as a term re-writing system where the rules are controlled by the developer a priori. The above IR to IR transformation is one of these rules, which in LLVM parlance, and from a compiler viewpoint, is a pass. See -lowerinvoke in http://llvm.org/docs/Passes.html for the command line option. See llvm::createLowerInvokePass(...) in Scalar.h; note the comments. However, this kind of implementation does not do stack unwinding; rather, it produces the standard longjmp-to-a-previous-setjmp behavior.
This is why I thought the pursuit of the zero-cost (exception setup with no throw) unwind approach was worth being caught by the Venus flytrap.

> There are actually limits to my madness, you know. :-) It would be more
> profitable to learn another aspect of the system by implementing a MMIX
> back-end or something.

Funny, I was thinking the same thing. Implementing MIX would be a cool way to learn the other side of LLVM (backends). I didn't even know there was a MMIX until your email forced me to query.

>
> Or, and I know this is just *crazy* talk, I could actually follow the
> intended learning path and use the main C++ API for something. :-)
>

Well, even though I did not take your route, I still use the IR ref. doc as my true documentation. It is fairly isomorphic to C++ IR API. So I think your approach is worth while.

> Dustin
>

Garrison

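For readers who, like Dustin, have not looked at setjmp/longjmp closely, the "longjmp to a previous setjmp" behavior can be sketched in plain C. This is only an illustration of the general idea, not what the -lowerinvoke pass actually emits; every name in it (handler_top, throw_error, deep_call, try_deep_call) is hypothetical.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch: one record per active "try" region, chained
 * into a stack through prev. */
struct handler {
    jmp_buf env;
    struct handler *prev;
};

static struct handler *handler_top = NULL;
static int thrown_code = 0;   /* carries the error code across the jump */
static int depth_reached = 0; /* records how deep the recursion got */

/* "Throw": jump straight back to the most recent setjmp. The frames in
 * between are simply abandoned; no per-frame cleanup code runs, which
 * is why this is not real stack unwinding. */
static void throw_error(int code)
{
    struct handler *h = handler_top;
    if (h == NULL)
        abort();              /* nothing to catch the error */
    handler_top = h->prev;    /* pop the handler before jumping */
    thrown_code = code;
    longjmp(h->env, 1);
}

/* Recurse to depth 5, throwing 42 if depth ever equals fail_at. */
static int deep_call(int depth, int fail_at)
{
    depth_reached = depth;
    if (depth == fail_at)
        throw_error(42);
    if (depth == 5)
        return depth;
    return deep_call(depth + 1, fail_at);
}

/* "Try": returns 0 on normal completion, else the thrown code. */
int try_deep_call(int fail_at)
{
    struct handler h;
    h.prev = handler_top;
    handler_top = &h;
    if (setjmp(h.env) == 0) {
        deep_call(0, fail_at);
        handler_top = h.prev; /* normal exit: pop our handler */
        return 0;
    }
    /* Arrived here via longjmp; throw_error already popped the handler. */
    return thrown_code;
}
```

Calling try_deep_call(3) returns 42 with the recursion cut off at depth 3, while try_deep_call(99) runs to completion and returns 0. Note that the setup cost (the setjmp) is paid on every "try" even when nothing is thrown, which is the cost the zero-cost unwind approach avoids.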
On 01/13/2010 02:28 PM, Garrison Venn wrote:

> I
> personally view LLVM as a term re-writing system where the rules are
> controlled by the developer a priori.

Hopefully I'll remember that comment when I understand its significance better. :-)

> Funny I was thinking the same thing. Implementing MIX would be a cool
> way to learn the other side of LLVM (backends).

It seemed appropriate, especially since I've always been too lazy to really learn MIX and that's unfortunate when one wants to go to the source instead of reading one of Knuth's interpreters. I haven't needed to do that often, but one should always have the option.

Also, I have common hardware so I have no real motivation to target a real machine (the only possible reason I could see is if I wanted to buy a board and do robotics with my boy, and at five he's not ready for that yet). So doing an (M)MIX backend would have the salutary effect of making me able to read Knuth better and that's more motivating than real hardware I'm not actually using myself. Plus, a priori I'd guess that (M)MIX is very likely more consistent and less quirky than any real architecture, as it has no practical constraints or opportunities to exploit.

(M)MIX would probably be a good choice for a backend-writing tutorial. I think expectations would be suitably modest--I don't think anyone is going to port the Linux kernel to MIX or anything, so presumably one wouldn't get endless requests to tweak the code gen to within an inch of its life. The existence of both MIX and MMIX could even be an advantage if both were supported, as one would have examples of both CISC and RISC style architectures.

> ...I didn't even know
> there was a MMIX until your email forced me to query.

I guess MMIX is to MIX as x86-64 is to 16-bit x86. Hopefully that rather than it being like ia64 is to x86. :-) Of course, the usefulness of MMIX more or less depends on Knuth finishing stuff. :-)

> Well, even though I did not take your route, I still use the IR ref.
> doc as my true documentation. It is fairly isomorphic to C++ IR API.
> So I think your approach is worth while.

I hope so. Though of course I have an agenda for learning LLVM too, and if that pans out I won't be able to escape doing things normally. I do not envision writing interpreters for anything more complex than Forth or Lisp in IR.

One advantage this backwards approach has is exposing more of the real machine nature than even C. I'd like to think that makes one a better compiler user in the end. It's nice to know what all those nice high-level semantics are really costing you.

I think part of my motivation, besides just doing the unexpected, is that long ago someone told me they took a class in "assembly and lisp"; basically, they taught programming by teaching you how to implement a higher-level language. Being young and stupid, I didn't see the point, but eventually I figured it out. It's never too late to re-do your childhood right, is it?

I also think that the effort to write good code at such a low level is very good discipline. At least, I find it so, because the consequences of good and bad design become magnified. The absence of scope nesting and the difficulty of doing many simple operations really makes factoring out a vocabulary of small toolkit functions useful, for example, and that's not a bad discipline to reinforce. I just created a couple of functions whose body is a single shift simply because it enforced some abstraction and the names are documentation. I can always move the body into the header and let LLVM inline them if I want to optimize away the function call overhead (not that there is any great need to do that in a learning tool).

(It's easy to tell which parts of the code I care about. The expression representation is pretty cleanly divided into a toolkit. The user interaction loop is a big fat function I didn't take the time to decompose.)

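The idea of a function whose body is a single shift can be illustrated in C. This is only a sketch: the thread does not say what Dustin's shifts do, so the tagged-word layout assumed here (a small type tag in the low bits of a cell word) and all of the names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout (not from the thread): the low TAG_BITS bits of a
 * cell word hold a type tag, the remaining bits hold the payload. */
enum { TAG_BITS = 2 };
enum { TAG_FIXNUM = 0, TAG_SYMBOL = 1, TAG_CONS = 2 };

/* Payload of a cell: the body is one shift, but the name says why. */
static inline uintptr_t cell_payload(uintptr_t cell)
{
    return cell >> TAG_BITS;
}

/* Type tag of a cell: mask off everything above the tag bits. */
static inline uintptr_t cell_tag(uintptr_t cell)
{
    return cell & (((uintptr_t)1 << TAG_BITS) - 1);
}

/* Build a cell from a payload and a tag. */
static inline uintptr_t make_cell(uintptr_t payload, uintptr_t tag)
{
    return (payload << TAG_BITS) | tag;
}
```

Callers then write cell_payload(c) rather than c >> 2, so the layout can change in one place. Declaring the helpers static inline in a header plays the same role as moving the body into the header so the optimizer can inline it.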
Apropos of nothing, learning that I'm not going to use invoke/unwind puts me back a bit while I bloat and uglify the evaluator code with exception tests and unwinding, but I'm not that far from Turing completeness now and that's kind of a good feeling. :-) I'm probably not obliged to go further than that unless I just want to.

Dustin
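What "exception tests and unwinding" by hand tends to look like can be sketched in C: every call returns a status, and every caller checks it and returns early. This is not Dustin's evaluator; the expression type, the status codes, and the eval function are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical status codes for a toy evaluator. */
enum status { OK = 0, ERR_DIV_ZERO, ERR_BAD_OP };

/* Hypothetical expression node: op is 'n' for a literal, or '+' / '/'. */
struct expr {
    char op;
    long value;                   /* used only when op == 'n' */
    const struct expr *lhs, *rhs; /* used only for binary operators */
};

/* Evaluate e into *out. On error, *out is left untouched and the
 * status is passed up one frame at a time: the "unwinding" is just
 * every caller testing the result and returning early. */
enum status eval(const struct expr *e, long *out)
{
    long a, b;
    enum status st;

    if (e->op == 'n') {
        *out = e->value;
        return OK;
    }

    st = eval(e->lhs, &a);
    if (st != OK)
        return st;                /* propagate the failure upward */
    st = eval(e->rhs, &b);
    if (st != OK)
        return st;

    switch (e->op) {
    case '+':
        *out = a + b;
        return OK;
    case '/':
        if (b == 0)
            return ERR_DIV_ZERO;  /* the "throw" */
        *out = a / b;
        return OK;
    default:
        return ERR_BAD_OP;
    }
}
```

The test after each recursive call is the bloat being complained about, but it needs no runtime support and each frame gets a chance to clean up before returning.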