Luke Kenneth Casson Leighton via llvm-dev
2019-Sep-22 15:24 UTC
[llvm-dev] porting AMDVLK to the Libre RISC-V 3D GPU: NLNet EUR 50,000 Grant application
hi all,

[please do cc the libre-riscv-dev list for this discussion, thanks]

after speaking to Michiel from NLNet, i am looking to put in an additional grant application for EUR 50,000, so that charitable donations to developers are available in order to convert AMD's AMDVLK driver, replacing its amdgpu LLVM IR backend with one that outputs RISC-V assembler instead. in stages, that will become vectorised assembler, and, later, will have special accelerated custom 3D opcodes (texturisation, OpenCL opcodes such as atan2 etc.) added as well.

would anybody be interested to be the intended recipient(s) of such charitable donations? (they would be tax-deductible in most jurisdictions). universities may also be recipients, such that they may then hire interns (or otherwise do what they wish). please do note that Corporations may *not* be the recipient of an NLNet charitable donation, but individuals may.

the deadline is oct 1st: there will be another opportunity (dec 1st) however i would prefer to meet the oct 1st deadline.

the technical details - starting point and plan - are as follows:

* the code start-point is here: https://github.com/GPUOpen-Drivers/AMDVLK

* this code uses both the (Reference) Khronos SPIR-V to LLVM-IR compiler - augmented - as well as a forked (being slowly updated to mainline) version of LLVM

* the AMD-forked version of LLVM contains not only support for AMDGPU texturisation: from what i gather from an analysis by Jacob Lifshay, it is also one of the few LLVM IR backends that support *explicit* full-function vectorisation intrinsics. this is *not* the same as sub-vector types (vec4, vec3 etc.)

* the "normal" versions of LLVM IR *lose* the explicit vectorisation information during the front-end to back-end conversion process, and, in the case of e.g. Vector backends such as the RISC-V RVV engine, will "opportunistically" *reinstate* (recover) the very vectorisation information that the AMDGPU SPIR-V to LLVM-IR already explicitly carries through, as part of [some of] the IR-to-assembly conversion passes.

* thus we cannot simply start from mainline LLVM, because it does not contain support for the full-function explicit vector-looping which AMD very deliberately added to their fork of LLVM.

* we need to *drop* the queue / pipe / etc. code within AMDVLK (contained within AMD's "PAL" library), and replace it with direct and explicit RISC-V assembler-generation. a good - perfectly acceptable - starting point for this would be the current mainline RISC-V LLVM JIT which has just been upstreamed.

this latter is the core of the work, and requires some ancillary explanation.

most GPUs are separated from the main CPU by way of shared memory - usually but not always over a PCIe Bus. AMD's PAL library is effectively a fully-functioning RPC subsystem that pushes AMDGPU assembly code (compiled JIT on the *CPU*) over to the Radeon GPU, pushes it the data it needs as well, carries out synchronisation and blah blah blah, you get the general idea, all of which is hugely complicated.

the Libre RISC-V SoC is a *HYBRID* CPU / GPU. the CPU *IS* the GPU. the GPU *IS* the CPU.

the accelerated texture assembly code instructions will be added *to the CPU*.
the YUV2RGB acceleration assembly code instructions will be added *to the CPU*.
the atan2 and other transcendental assembly code instructions will be added *to the CPU*.

all of these will be done over time, on an ongoing basis, starting initially from "base" RISC-V instructions.

thus what we need done is actually a drastic *simplification* of the AMDVLK assembly-generation code.

question (which more clearly illustrates where we are going with this): why do we not just use swiftshader and be done with it? surely you are doing a "software 3D driver", right?
https://github.com/google/swiftshader

the answer's "no, we are not". the reason is quite simple: swiftshader was *specifically* designed to be a software-only 3D GPU renderer, for use on *scalar* processors that may - or may not - have SIMD instructions [NOT Vector Engines]. as such, as mentioned above, it *does not have explicit full-function vectorisation support of any kind*. it may have support for sub-vector types (vec3, vec4), but it *does not* have built-in support for predication etc., all of which is vital information for a GPU and was the whole reason why SPIR-V was created as an augmented version of LLVM-IR in the first place.

in addition, swiftshader simply does not have support for texturisation or any other *hardware* accelerated features present in GPUs. this is reflected right back throughout the entire source tree and it would be far too much work to try to add it.

so we are *starting* from a software-only 3D driver, and then adding hardware-accelerated opcodes *to* that driver.

note very very specifically that there will *NOT* be an associated kernel driver involved here, nor will those instructions require special "privileges". they will be literally callable just like any other opcode such as FMUL, FADD, FDIV and so on.

it's quite a fascinating and exciting project that, in its simplest form, should be relatively straightforward. augmentations and optimisations may be done incrementally to increase performance and add experimental opcodes as the project progresses.

also please note: there's no actual contractual obligation, expressed or implied. this *really is* donations. it's definitely *not* "work-for-hire" and this is definitely *NOT* a "Job Proposal". if a milestone is completed, you receive the donation (direct from NLNet, *not* from me, or our team).

any questions please do ask.

one important condition: we need at least one person with an EU home address, who is also an EU citizen.
that EU citizen does not have to be living *in* the EU at the time of the application. they can be a UK citizen, because no decision has been made there yet.

best,

l.