Patrick Walton
2008-Nov-26 21:34 UTC
[LLVMdev] Proposal for TableML, llvmc2 configuration language
Hi,

I've been working on a proof of concept for a new configuration language for LLVM: specifically for my needs in llvmc2, but I have tried to make it as generic as possible for use throughout LLVM if other projects would like to make use of it. It's a compiler that compiles a near-subset of Standard ML to C++, with an architecture deliberately very similar to TableGen. The code is not yet ready to be merged by any means - it has many failure cases and may not compile at any given time - but I thought that before I go further I should send a proposal to the list. The WIP code, for the curious, is here:

http://github.com/pcwalton/llvm-nw/tree/miniml

If TableGen is a language that allows users to specify records of domain-specific information, TableML is a configuration language designed to allow users to specify how to *construct* records of domain-specific information.

TableML has a plugin architecture in which at any given time one of several backends is in use, just as in TableGen. The backends specify one or more record types and definitions. TableML then reads a configuration file, evaluates the definitions, and passes the results to the backend for serialization.

For instance, we might have a RegisterInfo backend that declares a definition of "RegisterNames : string list". Then we could have a TableML input file like this:

    def val RegisterInfo = [ "eax", "ebx", "ecx", "edx" ]

Or we could have a more complex one that performs computation to produce the result:

    val make32bit = (fn x => strcat("e", x))
    def val RegisterInfo = map make32bit [ "ax", "bx", "cx", "dx" ]

Obviously, this example is somewhat contrived, but it's just to illustrate that arbitrary computation is allowed (and is performed at compile time), as long as the definitions end up with the correct types. This could be thought of as a generalization of the "class" and "multiclass" concepts in TableGen.
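To make the "evaluate the definitions, then pass the results to the backend for serialization" flow concrete, here is a minimal C++ sketch of what it might amount to for the register example. It is not taken from the TableML code: `make32bit` mirrors the lambda in the example, while `evalRegisterNames`, `emitStringTable`, and the shape of the emitted table are hypothetical names and choices made only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical C++ stand-in for the TableML lambda `fn x => strcat("e", x)`.
std::string make32bit(const std::string &name) { return "e" + name; }

// Stand-in for evaluating `map make32bit [ "ax", "bx", "cx", "dx" ]`.
// In TableML this would happen inside the tool, before any C++ is emitted.
std::vector<std::string> evalRegisterNames() {
  const std::vector<std::string> base = {"ax", "bx", "cx", "dx"};
  std::vector<std::string> result;
  result.reserve(base.size());
  std::transform(base.begin(), base.end(), std::back_inserter(result),
                 make32bit);
  return result;
}

// Hypothetical backend step: serialize an evaluated `string list` as a C++
// table. Assumes a non-empty list whose strings need no escaping.
std::string emitStringTable(const std::string &tableName,
                            const std::vector<std::string> &values) {
  std::string out = "static const char *const " + tableName + "[] = { ";
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i != 0)
      out += ", ";
    out += "\"" + values[i] + "\"";
  }
  out += " };\n";
  return out;
}
```

Calling `emitStringTable("RegisterNames", evalRegisterNames())` yields a single table declaration containing the four computed names, so the `map` itself never appears in the generated code; only its result does.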
Also notice that, like all ML-based languages, TableML is strongly typed, and it makes heavy use of Hindley-Milner type inference. (The parser, lexer, and typechecker are all coded already, by the way, just not very well tested at the moment.) The subset of Standard ML that TableML supports is essentially the one shown here:

http://www.macs.hw.ac.uk/ultra/compositional-analysis/type-error-slicing/slicing.cgi

Now the upshot of this for the compiler driver is that function types are acceptable types for definitions. This means that, unlike TableGen, backends that want to allow scripting (which is currently just llvmc2) don't have to define their own programming languages. Instead, they can simply request a definition with a function type (e.g. SomeFunction : int -> int). TableML will hand the AST for the function, as well as its values, over to the backend for emission as C++ code. The backend is free to generate any C++ code it wants for the typed ASTs (of course, some support routines could be added to the base to make this easier).

So, in summary, there are two main benefits to TableML that I see, depending on the backend/use case:

(1) Users of backends that don't need scripting support can benefit from arbitrary computation in order to express the records, more than the macro facility that TableGen provides.

(2) Users of backends that do need scripting support don't have to define their own programming languages, without any run-time performance loss when compared to TableGen.

I'd definitely appreciate any comments on this proposal! I'd also be happy to clarify any issues with this explanation.

Patrick
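As a rough picture of what "hand the AST for the function over to the backend for emission as C++ code" could look like, the sketch below emits a C++ function from a tiny expression tree for something like `fn x => x + 1`. Everything here is an assumption for illustration: the `Expr` node layout, the helper constructors, and `emitIntFunction` are hypothetical and do not reflect TableML's actual AST or backend interface, which the proposal does not show.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical typed expression AST covering just enough for `int -> int`.
struct Expr {
  enum Kind { Var, IntLit, Add };
  Kind kind = IntLit;
  std::string name;               // used by Var
  int value = 0;                  // used by IntLit
  std::unique_ptr<Expr> lhs, rhs; // used by Add
};

// Small constructors for the three node kinds.
std::unique_ptr<Expr> var(std::string name) {
  std::unique_ptr<Expr> e(new Expr());
  e->kind = Expr::Var;
  e->name = std::move(name);
  return e;
}

std::unique_ptr<Expr> lit(int value) {
  std::unique_ptr<Expr> e(new Expr());
  e->kind = Expr::IntLit;
  e->value = value;
  return e;
}

std::unique_ptr<Expr> add(std::unique_ptr<Expr> lhs,
                          std::unique_ptr<Expr> rhs) {
  std::unique_ptr<Expr> e(new Expr());
  e->kind = Expr::Add;
  e->lhs = std::move(lhs);
  e->rhs = std::move(rhs);
  return e;
}

// Recursively print an expression as C++ source text.
std::string emitExpr(const Expr &e) {
  switch (e.kind) {
  case Expr::Var:
    return e.name;
  case Expr::IntLit:
    return std::to_string(e.value);
  case Expr::Add:
    assert(e.lhs && e.rhs && "Add node needs two operands");
    return "(" + emitExpr(*e.lhs) + " + " + emitExpr(*e.rhs) + ")";
  }
  return ""; // not reached for well-formed nodes
}

// Hypothetical backend hook: emit a definition of type `int -> int`.
std::string emitIntFunction(const std::string &fnName,
                            const std::string &param, const Expr &body) {
  return "int " + fnName + "(int " + param + ") { return " +
         emitExpr(body) + "; }\n";
}
```

For example, `emitIntFunction("SomeFunction", "x", *add(var("x"), lit(1)))` produces the text of an ordinary C++ function, which is why a scripted definition of this kind need not cost anything at run time beyond a normal compiled call.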
Mikhail Glushenkov
2008-Nov-27 21:42 UTC
[LLVMdev] Proposal for TableML, llvmc2 configuration language
Hi Patrick,

> I've been working on a proof of concept for a new configuration language
> for LLVM: specifically for my needs in llvmc2, but I have tried to make
> it as generic as possible for use throughout LLVM if other projects
> would like to make use of it.

Your proposal seems interesting - I especially like that you are using a functional language. Once your compiler is able to generate llvmc plugins, it will provide a nice TableGen alternative for llvmc.

> val make32bit = (fn x => strcat("e", x))
> def val RegisterInfo = map make32bit [ "ax", "bx", "cx", "dx" ]

It'd probably be nice if it were possible to syntactically distinguish between what is evaluated at run-time and at compile-time (like in Template Haskell).

> The subset of Standard ML that TableML supports is essentially
> the one shown here:
> http://www.macs.hw.ac.uk/ultra/compositional-analysis/type-error-slicing/slicing.cgi

As I understand from this link, TableML supports only lists and some primitive types (no algebraic datatypes). That'd be enough for llvmc, but I can't speak for the other backends; you'll probably need to integrate some additional syntactic sugar to cater to their needs.

> This means that, unlike TableGen,
> backends that want to allow scripting (which is currently just llvmc2)
> don't have to define their own programming languages. Instead, they can
> simply request a definition with a function type (e.g. SomeFunction :
> int -> int). TableML will hand the AST for the function, as well as its
> values, over to the backend for emission as C++ code.

Another (pie-in-the-sky) option is to compile TableML to LLVM IR and integrate llvmc with the JIT engine. That way llvmc won't even need a C++ compiler present to support plugins. But that's probably too heavyweight for a humble compiler driver :)
Patrick Walton
2008-Nov-28 01:35 UTC
[LLVMdev] Proposal for TableML, llvmc2 configuration language
> It'd probably be nice if it was possible to syntactically distinguish between
> what is evaluated at run-time and at compile-time (like in Template Haskell).

Well, it is in a sense: things evaluated at run time will always be inside lambda functions, while things evaluated at compile time aren't.

> As I understand from this link, TableML supports only lists and some primitive
> types (no algebraic datatypes).
>
> That'd be enough for llvmc, but I can't speak for the other
> backends; you'll probably need to integrate some additional
> syntactic sugar to cater to their needs.

The current plan is that backends will be able to define their own datatypes in the Standard ML sense, with explicit constructors.

> Another (pie-in-the-sky) option is to compile TableML to LLVM IR and integrate
> llvmc with the JIT engine.
> That way llvmc won't even need a C++ compiler present to support plugins.
> But that's probably too heavyweight for a humble compiler driver:)

At first I considered that, but this might create a bootstrapping problem: if TableML is to become an alternative to TableGen, then we could get into a situation in which TableML is needed to compile LLVM, and LLVM is needed to compile TableML.

Thanks for the feedback!

Patrick
Mike Stump
2008-Nov-29 21:58 UTC
[LLVMdev] Proposal for TableML, llvmc2 configuration language
On Nov 26, 2008, at 1:34 PM, Patrick Walton wrote:

> I've been working on a proof of concept for a new configuration language
> for LLVM: specifically for my needs in llvmc2, but I have tried to make
> it as generic as possible for use throughout LLVM if other projects
> would like to make use of it. It's a compiler that compiles a
> near-subset of Standard ML to C++, with an architecture deliberately
> very similar to TableGen.

Not lisp? [ runs away ducking ] :-)