Hi All,

So as discussed I've started working on a TGSI backend for llvm to use as a way to get compute going on nouveau (and other GPUs).

I'm still learning all the ins and outs of llvm, so I do not have much to show yet.

I've rebased Francisco's (curro's) latest version on top of llvm trunk, and added a commit on top to actually get it to build with the latest trunk. So currently I'm at the point where I've just taken Francisco's code and made it compile, no more and no less.

I have a git repo with this work available here:

http://cgit.freedesktop.org/~jwrdegoede/llvm/

So the next step would be to test this and see if it actually does anything. Questions:

1) Does anyone have a simple test case / command where I can invoke just llvm and get TGSI asm output to check?

2) Assuming I get the above to (somewhat) work, is there a way to make llvm show the output of the various intermediate passes in a human-readable form?

Regards,

Hans
Hello Hans,

Not to muddy the waters or anything, have you thought about the NIR integration that Rob was thinking about? I'm pretty sure he'll be happy to have extra people helping him out.

Cheers,
Emil
On Fri, Nov 13, 2015 at 9:25 AM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
> Hello Hans,
>
> Not to muddy the waters or anything, have you thought about the NIR
> integration that Rob was thinking about ?
> I'm pretty sure he'll be happy to have extra people helping him out.

How would that in any way plug into llvm or nouveau? There's no OpenCL C -> NIR, and there's no NIR -> nv50 IR...

-ilia
On 11/13/2015 02:46 PM, Hans de Goede wrote:
> Hi All,

Hey Hans,

>
> So as discussed I've started working on a TGSI backend for
> llvm to use as a way to get compute going on nouveau (and other gpu-s).
>
> I'm still learning all the ins and outs of llvm so I do not have
> much to show yet.
>
> I've rebased Francisco's (curro's) latest version on top of llvm
> trunk, and added a commit on top to actual get it build with the
> latest trunk. So currently I'm at the point where I've just
> taken Francisco's code, and made it compile, no more and no less.
>
> I have a git repo with this work available here:
>
> http://cgit.freedesktop.org/~jwrdegoede/llvm/

Thanks for sharing your work. :-)

>
> So the next step would be to test this and see if it actually
> does anything, questions:
>
> 1) Does anyone have a simple test case / command where I can
> invoke just llvm and get TGSI asm output to check ?
>
> 2) Assuming I get the above to (somewhat) work, is there a
> way to make llvm show the output of the various intermediate
> passes in a human readable form ?

Basically, you need to ask Clang to emit LLVM IR for you. For example, this command will emit LLVM IR:

    clang -cc1 -cl-std=CL1.2 -emit-llvm -triple spir64-unknown-unknown kernel.cl

Note that this command only works with an old LLVM version (I don't remember exactly which one).

But in your case, for the TGSI backend, I don't think there is an -emit-tgsi option which can directly output TGSI from OpenCL.

The other way, and in my opinion the best one, is to write a little C++ program based on the Clang/LLVM API for generating TGSI code. To do that, you can have a look at src/gallium/state_trackers/clover/llvm/invocation.cpp, which contains an example (but it seems to be outdated).

Basically, you need to use that CompilerInvocation object with some parameters and all the stuff around it. This should not take more than 100 LOC in my opinion.

I think the first step should be to emit LLVM IR before trying to get TGSI working.

I could write that program for you if you want, but I don't think I'll have time to do it this weekend.

Thanks.

> Regards,
>
> Hans

--
-Samuel
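As a rough illustration of the "emit LLVM IR first" step suggested above, the following sketch writes a tiny OpenCL C kernel and runs the clang command from this message on it. The kernel body, the CLANG variable, the temporary directory and the explicit -o option are assumptions added for illustration. As noted above, the exact flags may only be accepted by some clang versions, so the sketch only reports a failure instead of aborting.

```shell
#!/bin/sh
# Sketch: compile a minimal OpenCL C kernel to LLVM IR with clang -cc1.
# Assumption: CLANG names the clang binary to use (defaults to "clang").
CLANG="${CLANG:-clang}"
workdir="$(mktemp -d)"

# Hypothetical kernel that uses no OpenCL builtins, so no header is needed.
cat > "$workdir/kernel.cl" <<'EOF'
__kernel void test1(__global int *out, __global const int *in)
{
    out[0] = in[0] + in[1];
}
EOF

if command -v "$CLANG" >/dev/null 2>&1; then
  # Same options as the command quoted in the thread, plus an explicit -o.
  "$CLANG" -cc1 -cl-std=CL1.2 -emit-llvm -triple spir64-unknown-unknown \
    "$workdir/kernel.cl" -o "$workdir/kernel.ll" \
    || echo "clang rejected the command; this version may need other flags"
else
  echo "clang not found; wrote $workdir/kernel.cl only"
fi
```

If the command succeeds, the textual IR in kernel.ll can then be handed to llc.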
On Fri, Nov 13, 2015 at 02:46:52PM +0100, Hans de Goede wrote:
> Hi All,
>
> So as discussed I've started working on a TGSI backend for
> llvm to use as a way to get compute going on nouveau (and other gpu-s).
>
> I'm still learning all the ins and outs of llvm so I do not have
> much to show yet.
>
> I've rebased Francisco's (curro's) latest version on top of llvm
> trunk, and added a commit on top to actual get it build with the
> latest trunk. So currently I'm at the point where I've just
> taken Francisco's code, and made it compile, no more and no less.
>
> I have a git repo with this work available here:
>
> http://cgit.freedesktop.org/~jwrdegoede/llvm/
>
> So the next step would be to test this and see if it actually
> does anything, questions:
>
> 1) Does anyone have a simple test case / command where I can
> invoke just llvm and get TGSI asm output to check ?
>

The easiest way to do this is with the llc tool, which ships with llvm. It compiles LLVM IR to target code, which in this case is TGSI. I would recommend taking one of the simple examples from test/CodeGen/AMDGPU (you may need to get these from llvm trunk; I'm not sure what llvm version you are using).

To use llc:

    llc -march=tgsi input.ll -o -

This will output TGSI.

If you want to use clang to compile OpenCL C kernels to TGSI, you will need to teach clang about the TGSI target by implementing a sub-class of TargetInfo in lib/Basic/Targets.cpp. Look at the AMDGPU target for examples, but I recommend starting with llc.

> 2) Assuming I get the above to (somewhat) work, is there a
> way to make llvm show the output of the various intermediate
> passes in a human readable form ?
>

You can pass -print-before-all or -print-after-all to dump the intermediate forms.

> Regards,
>
> Hans
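The two suggestions above (llc plus the pass-printing options) can be combined in one small script. This is only a sketch: the LLC variable, the temporary directory and the check for a registered tgsi target are assumptions for illustration, and the IR is the scalar add case from test/CodeGen/AMDGPU/add.ll in the typed-pointer syntax of late-2015 LLVM, which newer llc builds may refuse to parse.

```shell
#!/bin/sh
# Sketch: feed a tiny IR file to llc and capture the per-pass dumps.
# Assumption: LLC points at an llc built from the TGSI tree (e.g. bin/llc).
LLC="${LLC:-llc}"
workdir="$(mktemp -d)"

# Scalar add test case, written in the IR syntax used by 2015-era LLVM trunk.
cat > "$workdir/input.ll" <<'EOF'
define void @test1(i32 addrspace(1)* %out, i32 addrspace(1)* %in) {
  %b_ptr = getelementptr i32, i32 addrspace(1)* %in, i32 1
  %a = load i32, i32 addrspace(1)* %in
  %b = load i32, i32 addrspace(1)* %b_ptr
  %result = add i32 %a, %b
  store i32 %result, i32 addrspace(1)* %out
  ret void
}
EOF

# Only run llc if it exists and lists a tgsi target among its registered targets.
if command -v "$LLC" >/dev/null 2>&1 &&
   "$LLC" -version 2>/dev/null | grep -qi tgsi; then
  # The TGSI asm goes to stdout; the pass dumps are written to stderr.
  "$LLC" -march=tgsi -print-after-all "$workdir/input.ll" -o - \
    2> "$workdir/passes.log" || echo "llc failed, see $workdir/passes.log"
else
  echo "no llc with a tgsi target found; wrote $workdir/input.ll only"
fi
```

Redirecting stderr to a file keeps the (often very long) pass dumps separate from the generated TGSI, which makes it easier to see after which pass the code starts to look wrong.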
Hans de Goede
2015-Nov-16  14:33 UTC
[Nouveau] [Mesa-dev] llvm TGSI backend (WIP) questions
Hi,

On 13-11-15 19:51, Tom Stellard wrote:
> On Fri, Nov 13, 2015 at 02:46:52PM +0100, Hans de Goede wrote:
>> Hi All,
>>
>> So as discussed I've started working on a TGSI backend for
>> llvm to use as a way to get compute going on nouveau (and other gpu-s).
>>
>> I'm still learning all the ins and outs of llvm so I do not have
>> much to show yet.
>>
>> I've rebased Francisco's (curro's) latest version on top of llvm
>> trunk, and added a commit on top to actual get it build with the
>> latest trunk. So currently I'm at the point where I've just
>> taken Francisco's code, and made it compile, no more and no less.
>>
>> I have a git repo with this work available here:
>>
>> http://cgit.freedesktop.org/~jwrdegoede/llvm/
>>
>> So the next step would be to test this and see if it actually
>> does anything, questions:
>>
>> 1) Does anyone have a simple test case / command where I can
>> invoke just llvm and get TGSI asm output to check ?
>>
>
> The easiest way to do this is with the llc tool which ships with llvm.
> It compiles LLVM IR to target code, which in this case is tgsi.
> I would recommend taking one of the simple examples from
> test/CodeGen/AMDGPU (you may need to get these from llvm trunk, not sure
> what llvm version you are using).
>
> To use llc:
>
> llc -march=tgsi input.ll -o -
>
>
> This will output TGSI.
>
>
> If you want to use clang to compile OpenCL C kernels to clang you will
> need to teach clang about the TGSI target by implementing the a
> sub-class of TargetInfo in lib/Basic/Targets.cpp. Look at the
> AMDGPU target for examples, but I recommend starting with llc.
>
>> 2) Assuming I get the above to (somewhat) work, is there a
>> way to make llvm show the output of the various intermediate
>> passes in a human readable form ?
>>
>
> You can pass -print-before-all or -print-after-all to dump the
> intermediate forms.

Thanks, this is exactly what I was looking for.

I'll send another status update when I've something worthwhile to report :)

Regards,

Hans
Hans de Goede
2015-Nov-18  14:53 UTC
[Nouveau] [Mesa-dev] llvm TGSI backend (WIP) questions
Hi,

On 13-11-15 19:51, Tom Stellard wrote:
> On Fri, Nov 13, 2015 at 02:46:52PM +0100, Hans de Goede wrote:
>> Hi All,
>>
>> So as discussed I've started working on a TGSI backend for
>> llvm to use as a way to get compute going on nouveau (and other gpu-s).
>>
>> I'm still learning all the ins and outs of llvm so I do not have
>> much to show yet.
>>
>> I've rebased Francisco's (curro's) latest version on top of llvm
>> trunk, and added a commit on top to actual get it build with the
>> latest trunk. So currently I'm at the point where I've just
>> taken Francisco's code, and made it compile, no more and no less.
>>
>> I have a git repo with this work available here:
>>
>> http://cgit.freedesktop.org/~jwrdegoede/llvm/
>>
>> So the next step would be to test this and see if it actually
>> does anything, questions:
>>
>> 1) Does anyone have a simple test case / command where I can
>> invoke just llvm and get TGSI asm output to check ?
>>
>
> The easiest way to do this is with the llc tool which ships with llvm.
> It compiles LLVM IR to target code, which in this case is tgsi.
> I would recommend taking one of the simple examples from
> test/CodeGen/AMDGPU (you may need to get these from llvm trunk, not sure
> what llvm version you are using).
>
> To use llc:
>
> llc -march=tgsi input.ll -o -
>
>
> This will output TGSI.

So after some bugfixing to fix a bunch of segfaults I get:

    $ bin/llc -march=tgsi ../test/CodeGen/AMDGPU/add.ll -o -
    # BB#0:
    UADDs TEMP0x, TEMP0x, 0
    LOADgis TEMP1z, [TEMP1y]
    UADDs TEMP1y, TEMP1y, 4
    LOADgis TEMP1y, [TEMP1y]
    UADDs TEMP1y, TEMP1z, TEMP1y
    STOREgis [TEMP1x], TEMP1y
    UADDs TEMP0x, TEMP0x, 0
    RET
    ENDSUB

and add.ll has:

    ;FUNC-LABEL: {{^}}test1:
    ;EG: ADD_INT {{[* ]*}}T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
    ;SI: v_add_i32_e32 [[REG:v[0-9]+]], vcc, {{v[0-9]+, v[0-9]+}}
    ;SI-NOT: [[REG]]
    ;SI: buffer_store_dword [[REG]],
    define void @test1(i32 addrspace(1)* %out, i32 addrspace(1)* %in) {
      %b_ptr = getelementptr i32, i32 addrspace(1)* %in, i32 1
      %a = load i32, i32 addrspace(1)* %in
      %b = load i32, i32 addrspace(1)* %b_ptr
      %result = add i32 %a, %b
      store i32 %result, i32 addrspace(1)* %out
      ret void
    }

So the generated code for test1 resembles the input somewhat but is in no way correct, e.g. I do not understand why it is assuming that both TEMP0x and TEMP1z contain the address of the array with the 2 input integers.

Nor do I understand why it is using TEMP1z and TEMP1y as sources for the UADD, where it has been doing the LOAD-s to TEMP0x and TEMP1y.

And then we have function test2 in add.ll:

    ;FUNC-LABEL: {{^}}test2:
    ;EG: ADD_INT {{[* ]*}}T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
    ;EG: ADD_INT {{[* ]*}}T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
    ;SI: v_add_i32_e32 v{{[0-9]+, vcc, v[0-9]+, v[0-9]+}}
    ;SI: v_add_i32_e32 v{{[0-9]+, vcc, v[0-9]+, v[0-9]+}}
    define void @test2(<2 x i32> addrspace(1)* %out, <2 x i32> addrspace(1)* %in) {
      %b_ptr = getelementptr <2 x i32>, <2 x i32> addrspace(1)* %in, i32 1
      %a = load <2 x i32>, <2 x i32> addrspace(1)* %in
      %b = load <2 x i32>, <2 x i32> addrspace(1)* %b_ptr
      %result = add <2 x i32> %a, %b
      store <2 x i32> %result, <2 x i32> addrspace(1)* %out
      ret void
    }

Which makes the tgsi backend completely unhappy:

    LLVM ERROR: Cannot select: t43: i32,ch = load<LD4[FixedStack0](align=16)> t45:1, FrameIndex:i32<0>, undef:i32
      t41: i32 = FrameIndex<0>
      t8: i32 = undef
    In function: test2

Any hints on where to start looking with fixing these issues would be much appreciated.

Regards,

Hans
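One possible way to narrow down the test2 failure, sketched here under assumptions rather than taken from the thread, is to cut the failing function out of add.ll with llvm-extract and rerun llc on it alone with the pass-printing option mentioned earlier. The variable names, the default path to add.ll and the log file are hypothetical, and the sketch skips itself when the tools or the input file are not available.

```shell
#!/bin/sh
# Sketch: isolate @test2 from add.ll and dump the IR before each pass.
# Assumptions: LLC and LLVM_EXTRACT come from the TGSI-enabled build tree,
# and ADD_LL points at test/CodeGen/AMDGPU/add.ll from llvm trunk.
LLC="${LLC:-llc}"
LLVM_EXTRACT="${LLVM_EXTRACT:-llvm-extract}"
ADD_LL="${ADD_LL:-../test/CodeGen/AMDGPU/add.ll}"
workdir="$(mktemp -d)"

if [ -f "$ADD_LL" ] &&
   command -v "$LLVM_EXTRACT" >/dev/null 2>&1 &&
   command -v "$LLC" >/dev/null 2>&1; then
  # Keep only @test2, written out as textual IR (-S).
  "$LLVM_EXTRACT" -func=test2 -S "$ADD_LL" -o "$workdir/test2.ll" &&
  # The pass dumps go to stderr; the last dump precedes the failing pass.
  "$LLC" -march=tgsi -print-before-all "$workdir/test2.ll" -o - \
    2> "$workdir/test2-passes.log" ||
    echo "extract or llc failed, see $workdir/test2-passes.log"
  result=ran
else
  echo "add.ll, llvm-extract or llc not available; nothing was run"
  result=skipped
fi
echo "result: $result"
```

Working on a single-function file keeps the dumps short, so the last IR printed before the "Cannot select" error is easier to find.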