Alessandro Di Federico via llvm-dev
2017-Oct-29 13:49 UTC
[llvm-dev] A query language for LLVM IR (XPath)
Hi,

sometimes, when dealing with LLVM IR, getting to a desired point of the
code is a bit cumbersome, in particular if you're instrumenting existing
code: it takes a lot of nested loops and if checks.

Maybe all of this could be avoided by employing a query language. Since
an LLVM module can be seen as a sort of tree with attributes, I think
that reusing an existing query language for XML would be appropriate.

In particular, I chose XPath [1] since it's more expressive than, say,
CSS selectors (e.g., you can move from the current element to its
parent).

Therefore, in a spare night, I took pugixml [2], a lightweight XML parser
with XPath support, stripped away everything that was XML-specific and
adapted it so that it can query an arbitrary tree, as long as a class
providing certain traits is supplied.

Attached you can find the class to query an LLVM module and an example
LLVM module (using LLVM 3.8, but newer versions should do too).

The current implementation pretends that a module looks like the
following XML tree (more or less):

    <main.ll>
      <main>
        <basicblock1>
          <alloca />
          <alloca />
          ...
        </basicblock1>
        ...
      </main>
    </main.ll>

Additional information could be encoded in attributes.
Please note that the queries are done on the LLVM IR directly, no XML
tree is materialized.

In the following you can find some examples:

    $ # Find all the basic blocks containing at least one alloca
    $ llvm-xpath '/main/*[count(alloca) > 0]' main.ll

      %1 = alloca i32, align 4
      %2 = alloca i32, align 4
      %i = alloca i32, align 4
      store i32 0, i32* %1, align 4
      store i32 %argc, i32* %2, align 4
      %3 = load i32, i32* %2, align 4
      store i32 %3, i32* %i, align 4
      br label %4

    $ # Find all store instructions
    $ llvm-xpath '/*/*/store'
      store i32 0, i32* %1, align 4
      store i32 %argc, i32* %2, align 4
      store i32 %3, i32* %i, align 4
      store i32 %6, i32* %i, align 4

Obviously this doesn't have to be exclusively a command line tool, but
we could have something like:

    for (auto *Store : TheModule.xpath<StoreInst>("/*/*/store"))
      /* ... */

I'm not releasing the full code yet since it's very much a work in
progress, but if anyone is interested in such a thing, just ping me.
The applications could range from using it in existing code to just
providing it for fast prototyping, e.g., in llvmcpy [3].

Obviously there are some open questions, such as how to deal with
operands, which could lead to an infinite tree, or how to organize
attributes. But it should be doable.

---
Alessandro Di Federico
PhD student at Politecnico di Milano

[1] https://en.wikipedia.org/wiki/XPath
[2] https://pugixml.org/
[3] https://github.com/revng/llvmcpy

-------------- next part --------------
A non-text attachment was scrubbed...
Name: main.ll
Type: application/octet-stream
Size: 1274 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171029/c913b314/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-node.cpp
Type: text/x-c++src
Size: 5968 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171029/c913b314/attachment.cpp>
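To make the traits idea above a bit more concrete, here is a minimal sketch of how a query engine could walk an arbitrary tree through a traits class. This is not the attached llvm-node.cpp and not pugixml's API: `ToyNode`, `NodeTraits`, `splitPath`, `selectPath` and `buildToyModule` are all hypothetical names, the toy module contents are made up, and only a tiny subset of XPath is handled (absolute paths made of element names and `*`, no predicates such as `count(...)`, no axes).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Module / Function / BasicBlock / Instruction.
// A real adapter would wrap the LLVM classes instead of copying into a tree.
struct ToyNode {
  std::string Name;              // element name: "main", "store", ...
  std::vector<ToyNode> Children; // nested elements, in order
};

// Hypothetical traits class: everything the query engine knows about a node.
template <typename NodeT> struct NodeTraits;

template <> struct NodeTraits<ToyNode> {
  static const std::string &name(const ToyNode &N) { return N.Name; }
  static const std::vector<ToyNode> &children(const ToyNode &N) {
    return N.Children;
  }
};

// Splits "/a/*/b" into {"a", "*", "b"}; empty steps are skipped.
inline std::vector<std::string> splitPath(const std::string &Path) {
  std::vector<std::string> Steps;
  std::stringstream Stream(Path);
  std::string Step;
  while (std::getline(Stream, Step, '/'))
    if (!Step.empty())
      Steps.push_back(Step);
  return Steps;
}

// Evaluates an absolute path made only of names and "*" wildcards.
// Root plays the role of the document node (the module), so the first step
// is matched against its children (the functions).
template <typename NodeT>
std::vector<const NodeT *> selectPath(const NodeT &Root,
                                      const std::string &Path) {
  assert(!Path.empty() && Path[0] == '/' && "only absolute paths are handled");
  using Traits = NodeTraits<NodeT>;
  std::vector<const NodeT *> Current{&Root};
  for (const std::string &Step : splitPath(Path)) {
    std::vector<const NodeT *> Next;
    for (const NodeT *Node : Current)
      for (const NodeT &Child : Traits::children(*Node))
        if (Step == "*" || Traits::name(Child) == Step)
          Next.push_back(&Child);
    Current = std::move(Next);
  }
  return Current;
}

// Toy module: one function "main" with two basic blocks (made-up contents).
inline ToyNode buildToyModule() {
  ToyNode Entry{"basicblock1", {{"alloca", {}}, {"store", {}}, {"br", {}}}};
  ToyNode Exit{"basicblock2", {{"load", {}}, {"store", {}}, {"ret", {}}}};
  ToyNode Main{"main", {Entry, Exit}};
  return ToyNode{"main.ll", {Main}};
}
```

With a module kept alive in a local variable, e.g. `ToyNode M = buildToyModule();`, calling `selectPath(M, "/*/*/store")` returns pointers to the store nodes of every block, and `selectPath(M, "/main/*")` returns the basic blocks of `main`. The engine only ever talks to `NodeTraits`, so in principle a specialization over the real LLVM classes could answer the same queries without materializing any XML tree; how operands and attributes would fit into such traits is exactly the open question mentioned above.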
Sean Silva via llvm-dev
2017-Oct-31 17:00 UTC
[llvm-dev] A query language for LLVM IR (XPath)
This is so cool! I once had a similar idea, but the way I was thinking
about it ended up more complex than I had time to implement (I sketched
it here:
http://lists.llvm.org/pipermail/llvm-dev/2013-November/067720.html).

Good idea using XPath to simplify the implementation and reuse existing
languages/libraries as a starting point!
Dean Michael Berris via llvm-dev
2017-Oct-31 22:21 UTC
[llvm-dev] A query language for LLVM IR (XPath)
As much as I'm not a fan of most XML things, this application of XPath
is *inspired*.

This would be a great testing/query tool for tests. It would also be a
great way to prototype passes.

Looking forward to seeing something like this in llvm/tools/ !

Cheers

-- Dean