Hi everyone,

I'm currently in the first year of my PhD, and I'm going to be looking at an experimental IR for my thesis. After looking at a variety of research compilers I've come to the conclusion that LLVM is the nicest to work with for my purposes. I was considering writing the code to construct this experimental IR from LLVM assembly, and then at the end of the process (i.e. new optimisations and transformations, etc.) I'll translate back into LLVM assembly to allow compilation to continue. This keeps everything modular, means I can work with "real" languages rather than toy ones, and so on.

I was thinking of generating my own lexer and parser for LLVM assembly. I'm aware that between the specification here:

http://llvm.org/docs/LangRef.html

and also the comments in LLParser.cpp there is information about the grammar for .ll files, but is there any documentation that simply states the full grammar, much in the style of this C grammar:

http://www.lysator.liu.se/c/ANSI-C-grammar-y.html

or this Python grammar?

http://www.python.org/doc/2.5.2/ref/grammar.txt

Does anyone have any thoughts or experience with parsing .ll files? I'd like to add that if I've managed to somehow miss the document I'm looking for, then I'm willing to make a paper dunce hat and wear it for the rest of the week.

Best,

James
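To give a rough sense of what the lexer half of a hand-written .ll front end might involve, here is a minimal sketch in Python. It is not taken from the thread and it is not the LLVM grammar: the token categories, the TOKEN_SPEC table and the tokenize function are all assumptions made for illustration, and they cover only a tiny fragment of the language (no strings, floats, metadata or attributes). The identifier patterns are a simplification of the ones described in LangRef.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token table for a tiny subset of LLVM assembly.
# Order matters: earlier alternatives win, so LABEL is tried before WORD.
TOKEN_SPEC = [
    ("COMMENT", r";[^\n]*"),            # comments run to the end of the line
    ("WS",      r"\s+"),
    ("LOCAL",   r"%[-a-zA-Z$._0-9]+"),  # local values such as %sum or %0
    ("GLOBAL",  r"@[-a-zA-Z$._0-9]+"),  # global values such as @add
    ("LABEL",   r"[-a-zA-Z$._0-9]+:"),  # basic block labels such as entry:
    ("INTTYPE", r"i[0-9]+\b"),          # integer types such as i32
    ("INT",     r"-?[0-9]+"),
    ("WORD",    r"[a-zA-Z_][a-zA-Z0-9_.]*"),  # keywords and opcodes
    ("PUNCT",   r"[(){}\[\]<>,=*]"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pat})" for kind, pat in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from source, skipping whitespace and comments."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        pos = match.end()
        if match.lastgroup not in ("COMMENT", "WS"):
            yield Token(match.lastgroup, match.group())


EXAMPLE = """
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b   ; integer addition
  ret i32 %sum
}
"""

if __name__ == "__main__":
    for token in tokenize(EXAMPLE):
        print(token.kind, token.text)
```

Even this small fragment needs care over rule ordering (labels versus keywords, integer types versus plain words), which hints at how much work a complete grammar would be compared with reusing an existing parser.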
John Criswell
2009-Mar-25 17:12 UTC
[LLVMdev] Rolling my own LLVM assembly language parser
jstanier wrote:
> Hi everyone,
>
> I'm currently in the first year of my PhD, and I'm going to be looking at an
> experimental IR for my thesis. After looking at a variety of research
> compilers I've come to the conclusion that LLVM is the nicest to work with
> for my purposes. I was considering writing the code to construct this
> experimental IR from LLVM assembly, and then at the end of the process (i.e.
> new optimisations and transformations, etc.) I'll translate back into LLVM
> assembly to allow compilation to continue. This keeps everything modular,
> means I can work with "real" languages rather than toy ones, and so on.
>

It seems to me that an easier approach would be to convert in-memory LLVM IR to your experimental IR using the LLVM libraries, then do whatever you do with your experimental IR, and then convert the code back into in-memory LLVM IR for LLVM based optimizations and code generation. The libraries for manipulating LLVM IR are well designed, reasonably documented, and seem to me to be easier to work with than creating your own LLVM IR parser.

-- John T.

> I was thinking of generating my own lexer and parser for LLVM assembly. I'm
> aware that between the specification here:
>
> http://llvm.org/docs/LangRef.html
>
> and also the comments in LLParser.cpp there is information about the grammar
> for .ll files, but is there any documentation that simply states the full
> grammar, much in the style of this C grammar:
>
> http://www.lysator.liu.se/c/ANSI-C-grammar-y.html
>
> or this Python grammar?
>
> http://www.python.org/doc/2.5.2/ref/grammar.txt
>
> Does anyone have any thoughts or experience with parsing .ll files? I'd like
> to add that if I've managed to somehow miss the document I'm looking for,
> then I'm willing to make a paper dunce hat and wear it for the rest of the
> week.
>
> Best,
>
> James
>
Anton Korobeynikov
2009-Mar-25 17:35 UTC
[LLVMdev] Rolling my own LLVM assembly language parser
Hello,

> I was thinking of generating my own lexer and parser for LLVM assembly. I'm
> aware that between the specification here:

Why do you need this? There is already a parser library inside LLVM framework and you can use it directly without any problems.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
Thank you both for your answers.

The only reason I was interested in not using the built-in parsing library was that it would give me more flexibility over the language I program in, but if it means brushing up on my C++ then this isn't too much of a problem either.

With regards to using the in-memory LLVM, that's also a good approach. However, I was thinking of structuring my thesis work as standalone tools (much like llvm-as and the others) as it would help me structure my work better. I hope this makes sense.

Best,

James
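As an illustration of the standalone-tool structure described above, the following Python sketch builds the command lines for a three-stage pipeline: disassemble bitcode with llvm-dis, run a separate tool over the textual IR, and reassemble with llvm-as. The tool name my-ir-opt, its command-line shape, and the intermediate file names are hypothetical and chosen only for this example; llvm-dis and llvm-as are the real LLVM tools, used here with their -o output option.

```python
import subprocess
from pathlib import Path
from typing import List


def build_pipeline(bitcode_in: str, bitcode_out: str,
                   tool: str = "my-ir-opt") -> List[List[str]]:
    """Return the argv list for each pipeline stage, without running anything."""
    stem = Path(bitcode_in).stem
    ll_in = f"{stem}.ll"        # textual IR produced by llvm-dis
    ll_out = f"{stem}.opt.ll"   # textual IR written by the experimental tool
    return [
        ["llvm-dis", bitcode_in, "-o", ll_in],
        [tool, ll_in, "-o", ll_out],  # hypothetical standalone tool
        ["llvm-as", ll_out, "-o", bitcode_out],
    ]


def run_pipeline(stages: List[List[str]]) -> None:
    """Run each stage in order, stopping at the first failure."""
    for argv in stages:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    for stage in build_pipeline("prog.bc", "prog.opt.bc"):
        print(" ".join(stage))
```

Keeping command construction separate from execution means the pipeline can be inspected or logged without the LLVM tools being installed; run_pipeline only works where llvm-dis, llvm-as and the experimental tool are on the PATH.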
On Wednesday 25 March 2009 17:12:34 John Criswell wrote:
> jstanier wrote:
> > Hi everyone,
> >
> > I'm currently in the first year of my PhD, and I'm going to be looking at
> > an experimental IR for my thesis. After looking at a variety of research
> > compilers I've come to the conclusion that LLVM is the nicest to work
> > with for my purposes. I was considering writing the code to construct
> > this experimental IR from LLVM assembly, and then at the end of the
> > process (i.e. new optimisations and transformations, etc.) I'll translate
> > back into LLVM assembly to allow compilation to continue. This keeps
> > everything modular, means I can work with "real" languages rather than
> > toy ones, and so on.
>
> It seems to me that an easier approach would be to convert in-memory
> LLVM IR to your experimental IR using the LLVM libraries, then do
> whatever you do with your experimental IR, and then convert the code
> back into in-memory LLVM IR for LLVM based optimizations and code
> generation. The libraries for manipulating LLVM IR are well designed,
> reasonably documented, and seem to me to be easier to work with than
> creating your own LLVM IR parser.

I assume James is only considering reinventing this wheel because he is not using C++. LLVM does not play nice with other languages because C++ is quite uninteroperable. There are C bindings to LLVM that make it a lot easier but, of course, they are far from complete. So I can fully appreciate the desire to do something like this.

However, my recommendation would be to augment LLVM with better interop rather than reimplement bits of it in other languages. So I would advise James to work on a more language agnostic machine-readable format for LLVM's IR (e.g. XML based) and contribute code to LLVM that lets it IO in that format as well as the current human-readable form.

As we discussed before, something like an autogenerated XML-RPC server to the whole of LLVM would be a much better solution offering easy interop with a huge variety of languages without having to write and maintain all of these bindings.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e