Folks,

Following the discussion with Nico and others, I've created PR20683 to discuss the implementation of a generic and externalised target-specific parsing API for LLVM, Clang and others.

I have a vague plan involving a generic class (say TargetParser) in lib/Target that is accessible as an API to any tool that needs target-specific parsing. The idea is then to let targets implement their own versions of it in a generic part of the code (still lib/Target) so that we don't break tools if we don't build every back-end (or any back-end) with LLVM. Maybe this could be fixed in CMake, maybe not.

This class should also allow for customization by the tools, either to add functionality to existing functions (like pre-parsing, post-parsing, de-mangling etc.) or to add completely new functions.

Also, since a common side effect of parsing architectural parameters is to set flags in specific classes, we should expect that every tool will *have* to override the "updateFlags" method inside the tool (including LLVM's integrated assembler) to use its own target-specific structures.

I'm hoping that this makes sense. Please let me know if there's any major flaw, or existing infrastructure that we could be re-using. But from the looks of target parsing in both LLVM and Clang, there isn't... :/

cheers,
--renato
Renato,

Could you give a couple of examples where this would be useful? I find myself without the context to really understand your proposal. Are we discussing target specific language extensions? Assembly parsing? Something else entirely?

Philip

On 08/16/2014 05:43 AM, Renato Golin wrote:
> [...]
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On 18 August 2014 17:53, Philip Reames <listmail at philipreames.com> wrote:
> Could you give a couple of examples where this would be useful? I find
> myself without the context to really understand your proposal. Are we
> discussing target specific language extensions? Assembly parsing?
> Something else entirely?

Hi Philip,

Sorry, the bug tracker entries (two, for now) should have more context, but here's a summary:

Nico started trying to solve a problem where ".fpu neon" wouldn't change the instruction set in an assembly file, and he found that he needed a parser identical to the one in Clang to parse the exact same semantics ("neon", "vfpv3", etc). The ARM assembly also has a .cpu directive, which is pretty much the same story, so he thought that we could share the parser on both sides, maybe exposing some functionality from LLVM to Clang.

The problem, as Reid mentioned, is that LLVM doesn't need to compile with all back-ends, and an implementation in a back-end that is not compiled will generate link-time errors on a statically compiled Clang or run-time errors on a dynamically compiled Clang.

But this also opens a can of worms, where we start to leak target-specific knowledge through the back door, without a properly defined API that has to be respected over the years, once everyone else has forgotten that we've done that.

I blocked such a patch from going in, because we'd be replacing code duplication with implicit coupling of far-away pieces of code, but now that the original problem is solved (by duplicating code), we have to fix the duplication problem. Since duplication was already there, especially in the sub-arch parsing, we should be able to sweep away a few other similar bugs with a single fix.

Since parsing of strings is generic, and should be used by all tools (that support -mfpu), that piece of code can live in lib/Target and Clang can rest assured that it'll be there.
But the second part, the Clang-specific one, will only live in Clang, and use Clang's own structures to hold and change the sub-architecture feature flags wherever it's needed. Same for all other tools, and same for the MC assembler.

So, as an example in the assembler case, parseDirective sees ".fpu" and calls parseDirectiveFPU, which uses ArchParser->parseFPU(), returning an enum/bitfield owned by ArchParser, which is then relayed to the assembler so it can call setAvailableFeatures(FPU).

In Clang, the driver will observe -mfpu and call ArchParser->parseFPU() which, again, will return the same enum/bitfield to the driver, which will then update its own flags to -cc1, etc.

We can use enums/bitfields as a communication method, re-using most of what's in use right now to identify those things, but moved into one single place. That would be the quick and simple solution.

If it turns out we need some more complex fiddling, we might have to create some call-backs (virtual preParse(), virtual postParse() that do nothing in the base class) etc., so that Clang can override them and do what's needed, but that's only if we can't do it straight with enums.

cheers,
--renato
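The enum-based flow described in the message above could look roughly like the following C++ sketch. It is only an illustration under stated assumptions: FPUKind, ToyAssembler, driverFlagsForFPU, the feature bits and the "+vfp3"/"+neon" strings are hypothetical names made up for the example, not the actual LLVM or Clang API. Only the idea of one shared parseFPU whose enum result each tool maps onto its own state comes from the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical shared enum; in the proposal it would be owned by the
// always-built parsing code, not by any back-end.
enum class FPUKind { Invalid, VFPv3, NEON };

// Shared string parser (illustrative name table, not LLVM's real one).
FPUKind parseFPU(const std::string &Name) {
  if (Name == "vfpv3") return FPUKind::VFPv3;
  if (Name == "neon")  return FPUKind::NEON;
  return FPUKind::Invalid;
}

// Assembler-side consumer: maps the shared enum onto its own bitmask.
struct ToyAssembler {
  enum : std::uint32_t { FeatureVFP3 = 1u << 0, FeatureNEON = 1u << 1 };
  std::uint32_t AvailableFeatures = 0;

  // Handles ".fpu <name>"; returns false and changes nothing on an
  // unknown name.
  bool parseDirectiveFPU(const std::string &Name) {
    switch (parseFPU(Name)) {
    case FPUKind::VFPv3:   AvailableFeatures = FeatureVFP3; return true;
    case FPUKind::NEON:    AvailableFeatures = FeatureNEON; return true;
    case FPUKind::Invalid: return false;
    }
    assert(false && "unhandled FPUKind");
    return false;
  }
};

// Driver-side consumer: turns the same enum into its own flag string
// (the strings are placeholders for whatever the driver passes to -cc1).
std::string driverFlagsForFPU(const std::string &Name) {
  switch (parseFPU(Name)) {
  case FPUKind::VFPv3:   return "+vfp3";
  case FPUKind::NEON:    return "+neon";
  case FPUKind::Invalid: return "";
  }
  assert(false && "unhandled FPUKind");
  return "";
}
```

In this shape, ".fpu neon" in the assembler and -mfpu=neon in the driver go through the same name table, and neither consumer needs a back-end to be linked in.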
Note that it is not just option parsing. It is about the bits of information about a target that we expect to be available even when that target is not compiled. It includes target options like -mfpu, but should also include things like creating the DataLayout string which is currently duplicated in clang.

A slightly different option would be to just require the llvm targets to implement a virtual interface, but that would mean that something as basic as

clang -target armv7-pc-linux -### -c test.c

would fail if the llvm arm backend was not compiled.

On 16 August 2014 08:43, Renato Golin <renato.golin at linaro.org> wrote:
> [...]
On 18 August 2014 22:58, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> Note that it is not just option parsing. It is about the bits of
> information about a target that we expect to be available even when
> that target is not compiled.

Exactly, this is why the sub-arch information needs to be tool-specific, but the parsing (even of the DataLayout) doesn't, since it is identical for all tools.

Either returning agreed enum values, or letting the implementation override some methods, or even using template policies would do the trick. We just need to find the simplest implementation for this case.

> A slightly different option would be to just require the llvm targets
> to implement a virtual interface, but that would mean that something
> as basic as
>
> clang -target armv7-pc-linux -### -c test.c
>
> would fail if the llvm arm backend was not compiled.

Oh, so that's why I think the bit setting needs to be tool-specific. Clang has its own way of setting the arch bits, and this ArchParser doesn't know anything about it, nor should it.

Let me give it a try, completely untested...

In LLVM:

lib/Target/ArchParser.cpp:

    class ArchParser {
      virtual parseFPU();
      virtual parseCPU();
      ...
      virtual setFPUBits();
      virtual setCPUBits();
    };

    class ARMArchParser : public ArchParser {
      parseFPU() override { ... }
      parseCPU() override { ... }
      ...
    };

    class X86ArchParser : ...

In Clang:

lib/Driver/ArchParser.cpp:

    template <class BaseArchParser>
    class ClangArchParser : public BaseArchParser {
      setFPUBits() override { ... }
      setCPUBits() override { ... }
      ...
    };

    // Returned by pointer so the derived object is not sliced.
    ArchParser *GetArchParser(StringRef TargetName) {
      if (TargetName == "ARM")
        return new ClangArchParser<ARMArchParser>();
      ...
    }

In llc, lli, lld and the integrated assemblers, do as in Clang.

ClangArchParser's setFPUBits will have nothing from the ARM back-end in it, because its bits will be Clang-specific.
I know this still keeps ARM knowledge in Clang, but it moves it into a specific area that other parts of Clang can access, and will help us leave the Clang-specific sub-arch knowledge in Clang, and the ARM-specific option parsing in LLVM.

Currently, the behaviour is to allow all options to work with -###, including for back-ends that aren't compiled, and that's how Clang tests behave. Changing that would need a major change in the tests. If we really want to soft-fail -### and relatives when a back-end is not compiled, we'll have to find a solution for it in addition to changing all the tests. But that's step 2.

cheers,
--renato
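For readers who want something compilable, the layered design sketched in the message above might be fleshed out along these lines. This is a hedged illustration, not the code that was proposed or later committed: FPUKind, ToolArchParser, FeatureString, getArchParser and the name table are assumptions introduced for the example, and std::string stands in for LLVM's StringRef so the snippet has no LLVM dependency.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical shared result type.
enum class FPUKind { Invalid, VFPv3, NEON };

// Generic interface (the lib/Target side in the thread).
class ArchParser {
public:
  virtual ~ArchParser() = default;
  // Target-specific: turn a name such as "neon" into a shared enum value.
  virtual FPUKind parseFPU(const std::string &Name) const = 0;
  // Tool-specific: record the result in the tool's own structures.
  virtual void setFPUBits(FPUKind Kind) = 0;
};

// Target layer: knows the ARM spellings, nothing about any tool.
// Still abstract, because setFPUBits is left to the tool.
class ARMArchParser : public ArchParser {
public:
  FPUKind parseFPU(const std::string &Name) const override {
    if (Name == "vfpv3") return FPUKind::VFPv3;
    if (Name == "neon")  return FPUKind::NEON;
    return FPUKind::Invalid;
  }
};

// Tool layer: a template over the target parser, so the tool-owned
// state never leaks into the target class.
template <class BaseArchParser>
class ToolArchParser : public BaseArchParser {
public:
  std::string FeatureString; // stand-in for the tool's own flag storage

  void setFPUBits(FPUKind Kind) override {
    switch (Kind) {
    case FPUKind::VFPv3:   FeatureString = "+vfp3"; return;
    case FPUKind::NEON:    FeatureString = "+neon"; return;
    case FPUKind::Invalid: FeatureString.clear();   return;
    }
    assert(false && "unhandled FPUKind");
  }
};

// Factory: returns an owning pointer (avoids slicing); nullptr means
// the target name is not known to this sketch.
std::unique_ptr<ArchParser> getArchParser(const std::string &TargetName) {
  if (TargetName == "ARM")
    return std::unique_ptr<ArchParser>(new ToolArchParser<ARMArchParser>());
  return nullptr;
}
```

A tool would then call something like `P->setFPUBits(P->parseFPU("neon"))` on the returned parser. The parsing half depends only on the target layer, and the flag-setting half only on the tool layer.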