Alessandro Di Federico via llvm-dev
2017-Jan-12 19:44 UTC
[llvm-dev] llvmcpy: yet another Python binding for LLVM
Hi, I wrote yet another [1,2] Python binding for LLVM! I'm doing this because llvmlite has some serious limitations:

1) it cannot parse existing IR, only create new modules [3];
2) it keeps its own representation of the IR (which is less memory efficient than LLVM's own);
3) each llvmlite version supports a single LLVM version.

Considering that I need to load modules of hundreds of MiB, this was kind of a problem.

So I've come up with a "Python API generator". Basically it uses CFFI [4] to parse the LLVM-C API headers and automatically generate (using some heuristics) a Pythonic API, with classes, properties and the like.

I've quickly tested it with LLVM 3.4, 3.8 and 3.9 and, for its simplicity, it does a good job. It also supports multiple LLVM installations (it uses the one belonging to the first llvm-config found in PATH).

I'd be happy to have some feedback, give it a look:

https://rev.ng/llvmcpy

--
Alessandro Di Federico
PhD student at Politecnico di Milano

[1] http://www.llvmpy.org/
[2] https://github.com/numba/llvmlite
[3] https://github.com/numba/llvmlite/issues/157
[4] http://cffi.readthedocs.io/en/latest/
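To make the "API generator" idea concrete, here is a minimal sketch of how function prototypes could be turned into a plan for classes and properties. It is not llvmcpy's actual code: it uses a plain regular expression instead of CFFI, and the grouping rule (class chosen from the type of the first argument) and the snake_case naming are assumptions made for illustration. `plan_classes` and `camel_to_snake` are hypothetical helper names.

```python
import re
from collections import defaultdict

# A few prototypes in the style of the LLVM-C API, simplified for the example.
PROTOTYPES = """
const char *LLVMGetTarget(LLVMModuleRef M);
void LLVMSetTarget(LLVMModuleRef M, const char *Triple);
const char *LLVMGetValueName(LLVMValueRef Val);
void LLVMSetValueName(LLVMValueRef Val, const char *Name);
void LLVMDumpValue(LLVMValueRef Val);
"""

# Capture the function name and the type of its first argument.
PROTO_RE = re.compile(r"(LLVM\w+)\s*\(\s*(LLVM\w+Ref)\b")


def camel_to_snake(name):
    """Convert 'ValueName' to 'value_name'."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def plan_classes(prototypes):
    """Group functions by first-argument type (assumed rule) and decide
    which ones become properties and which ones stay plain methods."""
    by_class = defaultdict(set)
    for func, first_arg in PROTO_RE.findall(prototypes):
        class_name = first_arg[len("LLVM"):-len("Ref")]  # LLVMValueRef -> Value
        by_class[class_name].add(func[len("LLVM"):])     # LLVMGetTarget -> GetTarget

    plan = {}
    for class_name, funcs in by_class.items():
        properties, methods = [], []
        for func in sorted(funcs):
            if func.startswith("Get") and "Set" + func[3:] in funcs:
                # A Get/Set pair collapses into one read/write property.
                properties.append(camel_to_snake(func[3:]))
            elif func.startswith("Set") and "Get" + func[3:] in funcs:
                continue  # already covered by the matching getter
            else:
                methods.append(camel_to_snake(func))
        plan[class_name] = {"properties": properties, "methods": methods}
    return plan


if __name__ == "__main__":
    for name, members in plan_classes(PROTOTYPES).items():
        print(name, members)
```

A real generator would then emit Python classes whose properties call the underlying C functions through CFFI; this sketch stops at the planning step so that it runs without LLVM installed.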
Philip Reames via llvm-dev
2017-Jan-13 01:30 UTC
[llvm-dev] llvmcpy: yet another Python binding for LLVM
Using something like CFFI to autogenerate bindings is definitely a good approach to this problem. It'll produce bindings which aren't entirely idiomatic for Python, but they'll at least be reasonably likely to remain in sync. This also has the nice property that new additions to the C API get picked up without manual work; this should serve to incentivize contributions in this area.

You mention in your readme that you had to slightly modify the LLVM C headers to get this approach to work. Can you point out a couple of example changes? Maybe these are things we should consider taking upstream.

I'm not familiar with the details of CFFI. Are the bindings it generates for a particular set of headers specific to the machine they're generated on? Or could the resulting bindings be published and reused directly? If so, hosting a set of bindings for previous releases would be a useful service.

Philip
Alessandro Di Federico via llvm-dev
2017-Jan-13 09:03 UTC
[llvm-dev] llvmcpy: yet another Python binding for LLVM
On Thu, 12 Jan 2017 17:30:09 -0800 Philip Reames <listmail at philipreames.com> wrote:

> You mention in your readme that you had to slightly modify the LLVM C
> headers to get this approach to work. Can you point out a couple of
> example changes? Maybe these are things we should consider taking
> upstream.

Take a look at the `clean_include_file` function:

https://github.com/revng/llvmcpy/blob/master/llvmcpy/llvm.py#L342

Basically CFFI doesn't handle enum entries whose value is computed through an expression. In the LLVM-C API we sometimes have 1 << 8. Static inline functions are not handled either (CFFI only handles function prototypes), so I have to strip them away.

I'm not sure it'll ever be possible to handle unmodified LLVM-C API headers, and given that one explicit aim is to support older versions of LLVM I'd have to keep that code anyway. It would be nice, however, if that code didn't have to grow in the future (e.g., because of more sophisticated expressions as enum values).

A thing I like about the C API is the consistency in function naming, like having LLVMGetSomething/LLVMSetSomething pairs, LLVMCountSomethings/LLVMGetSomethings pairs and LLVMGetFirstSomething/LLVMGetNextSomething pairs.

What I'd need is the ability to know the names of the arguments, which CFFI doesn't provide. That would allow me to set up slightly more robust heuristics. For instance, I'm now treating a pointer argument followed by an integer argument as a pointer to an array plus its size; that's fine in current versions of LLVM, but it's not very robust. The same goes for error messages: having the argument names would help. But this is more of a CFFI issue.

> I'm not familiar with the details of CFFI. Are the bindings it
> generates for a particular set of headers specific to the machine
> it's generated on? Or could the resulting bindings be published and
> reused directly? If so, hosting a set of bindings for previous
> releases would be a useful service.

I'm not entirely sure they're portable across OSes/architectures. What would be the use case? It takes a moment to generate the bindings, but it's something the module will lazily do for you only once.

--
Alessandro Di Federico
PhD student at Politecnico di Milano
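As an illustration of the two header fixes described in this message (dropping static inline functions and replacing computed enum values such as 1 << 8 with literals), here is a rough sketch. It is not the `clean_include_file` implementation from llvmcpy; `strip_static_inline`, `fold_enum_values` and `eval_int_expr` are hypothetical helpers, and the sample header uses made-up names (`LLVMFoo`, `LLVMHelper`, `LLVMDoSomething`). It assumes braces are balanced and never appear in strings or comments, and it only folds simple integer expressions.

```python
import ast
import operator
import re

# Integer operators we are willing to fold; anything else is left untouched.
_OPS = {
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
    ast.BitOr: operator.or_,
    ast.Add: operator.add,
    ast.Sub: operator.sub,
}


def eval_int_expr(text):
    """Evaluate a simple integer expression such as '1 << 8' without eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression: %r" % text)
    return walk(ast.parse(text.strip(), mode="eval").body)


def strip_static_inline(text):
    """Remove every 'static inline' definition, body included."""
    out, pos = [], 0
    while True:
        start = text.find("static inline", pos)
        if start == -1:
            out.append(text[pos:])
            return "".join(out)
        out.append(text[pos:start])
        i = text.index("{", start)  # opening brace of the function body
        depth = 0
        while True:  # walk forward to the matching closing brace
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    break
            i += 1
        pos = i + 1


ENUM_VALUE_RE = re.compile(r"(\b\w+\s*=\s*)([^,}\n]+)")


def fold_enum_values(text):
    """Replace 'NAME = <expr>' with 'NAME = <literal>' where <expr> folds."""
    def repl(match):
        try:
            return match.group(1) + str(eval_int_expr(match.group(2)))
        except (ValueError, SyntaxError):
            return match.group(0)  # leave anything we cannot evaluate
    return ENUM_VALUE_RE.sub(repl, text)


HEADER = """
typedef enum {
  LLVMFooA = 1 << 0,
  LLVMFooB = 1 << 8
} LLVMFoo;

static inline int LLVMHelper(int x) {
  if (x) { return 1; }
  return 0;
}

int LLVMDoSomething(int x);
"""

if __name__ == "__main__":
    # Strip function bodies first so assignments inside them are never folded.
    print(fold_enum_values(strip_static_inline(HEADER)))
```

The cleaned text could then be handed to CFFI's `cdef`. Expressions this sketch cannot fold (for example, ones using C integer suffixes like `1U`) are deliberately left as they are.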