Tom Stellard <tom at stellard.net> writes:
> [...]
> Hi Francisco,
>
Hi Tom,
> I would be happy to be a mentor for this project if it is accepted. I
> have a few comments about your proposal:
>
Great.
>> I'm attaching a preliminary version of my proposal -- would be happy to
>> get some feedback about it.
>>
>
>> GSoC proposal: TGSI compiler back-end.
>>
>> - Proposal
>>
>> TGSI is the intermediate representation that all open-source GPU
>> drivers using the Gallium3D architecture understand. Until now it's
>> mainly been used for graphics (vertex, fragment shaders, etc.), but
>> doing general-purpose computing with it is possible in principle
>> (actually, necessary for GL4), and it's been the object of a number of
>> extensions and improvements to make it more suitable for that purpose.
>>
>> The TGSI IR has some peculiarities that are unusual in a typical CPU
>> instruction architecture (and slightly annoying to deal with) -- It's
>> a vector-centric architecture with a variable set of typeless
>> registers, no stack and no proper support for irreducible control
>> flow.
>>
>> The objective of this project would be to write an LLVM compiler
>> back-end with the TGSI IR as target.
>>
>> - Benefits
>>
>> This back-end is the last piece missing for a working and fully
>> open-source implementation of OpenCL running on the nVidia nv50 and
>> nve4 architectures -- though there's nothing nVidia-specific in the
>> TGSI language, and code generated by this back-end will be expected to
>> be usable by any other driver implementing the compute API of
>> Gallium3D.
>>
>> - Biographical background
>>
>> I'm currently a master's student in the field of theoretical physics.
>>
>> I've already (successfully) participated in the GSoC program with a
>> device driver development project (which had to do with
>> reverse-engineering nVidia's TV encoders) mentored by the X.Org
>> Foundation in 2009, after that I've remained a frequent contributor to
>> the Nouveau and Mesa projects for the next few years.
>>
>> Last year I wrote most of an OpenCL implementation running on nVidia
>> hardware as part of the X.Org Foundation's EVoC program [1] -- the only
>> piece missing being the compiler.
>>
>> I've gained some experience with LLVM by writing a proof-of-concept
>> TGSI back-end which is minimally working [2] -- the goal of this
>> project would be to bring it to a useful state.
>>
>> - Timeline
>>
>> Summary of the work that would be done:
>
> I'm not sure what the current status of your TGSI backend is, but I would
> recommend getting assembly generation working first, since this will
> enable you to write lit tests.
>
That already sort of works... The only thing is that the assembly files
that it produces are somewhat non-standard because they include section
annotations and other unusual syntax that wouldn't be recognized by the
normal TGSI parser... It might be worth looking into it at some point
but I don't think it's very high-priority, what I have seems to be
enough to make lit happy.
>>
>> * Get object file generation working.
>> (approx. June 17 - July 8)
>>
>> The output format will be the one expected by Mesa. The
>> implementation will take advantage of the existing MC assembler
>> API as much as possible.
>>
>
> Can you elaborate a little more on the output format you will be using?
> For example, will you be generating ELF binaries with special metadata
> sections (This is what R600 currently does) or will you be creating your
> own object format?
I'd be fine with using ELF, but it would definitely need special
metadata sections as you say (for kernel prototypes and so on), and
clover would have to be fixed to deal with it correctly -- OTOH the
minimalistic format implemented in 'clover/core/module.cpp' seems to do
everything we need, so another option would be to stick to it.
>
>> * Fix handling of the multiple OpenCL address spaces.
>> (approx. July 8 - July 22)
>>
>> Operations on __global, __local, and __private memory will be
>> dealt with using the resource access opcodes, __kernel function
>> parameters will be accessed through a special resource meant for
>> parameter passing, __constant memory will be mapped to constant
>> buffers.
>>
>> * Get function calls working reliably.
>> (approx. July 22 - August 5)
>>
>> This will involve fixing the passing of aggregate types and
>> anything that doesn't fit in a 32-bit register, fixing stack
>> allocations (i.e. the "alloca" instruction), and fixing calls to
>> functions that use the "kernel" calling convention from non-kernel
>> functions.
>>
>> * Get control flow working reliably.
>> (approx. August 5 - August 19)
>>
>> This will involve writing a control flow structurizing pass -- It
>> might be possible to promote the R600 one to a common analysis
>> pass and reuse it.
>>
>
> I have a feeling this task may take longer than two weeks. When you
> write the final version of your proposal, I think you should have a
> definitive plan for how you will implement the structurization. Whether
> it's reusing existing R600 code (this is my recommendation) or writing
> something from scratch.
>
Reusing the R600 code would be possible for sure with just a few
changes, but I think it would be nice to split the algorithm into an
"analysis" and a "transformation" pass to leave the target the choice on
how irreducible edges in the CFG should be handled -- Depending on the
hardware and the specific case it might be better to remove irreducible
edges by duplicating basic blocks, by introducing temporary "control
flow" variables (as the SI structurizer does), or by not doing anything
at all (e.g. on nVidia hardware arbitrary branches are actually
supported, they're just somewhat inefficient).
It would also be nice if inter-pass dependencies were handled correctly
and we didn't have to disable any other optimization passes that
decanonicalize the control flow as R600 does.
I agree that two weeks might be too little for what I have in mind, but
I guess if we drop the standard library point below (or we make it
optional) it should be plenty of time to do it right.
> Also, I would really prefer if your structurization solution was target
> independent and could live outside of the backend in the common code,
> because a good structurization solution would be a great benefit to the
> LLVM project.
Yeah, that was my idea too.
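
To make "irreducible" concrete: a CFG is reducible if it collapses to a
single node when you repeatedly drop self-loops (T1) and merge any block
that has exactly one predecessor into that predecessor (T2). Below is a
minimal Python sketch of that textbook test, roughly the kind of question
an "analysis" pass would answer. It is not the R600 or SI structurizer
algorithm, and the function name and the dict-of-successors CFG
representation are assumptions made up for illustration.

```python
def is_reducible(cfg, entry):
    """Return True if the CFG collapses to one node under T1/T2.

    cfg maps a block name to an iterable of successor block names
    (hypothetical representation, chosen only for this sketch).
    """
    # Only blocks reachable from the entry block matter.
    reachable, stack = set(), [entry]
    while stack:
        node = stack.pop()
        if node not in reachable:
            reachable.add(node)
            stack.extend(cfg.get(node, ()))
    succs = {n: set(cfg.get(n, ())) for n in reachable}

    changed = True
    while changed:
        changed = False
        # T1: remove self-loops.
        for n in succs:
            if n in succs[n]:
                succs[n].discard(n)
                changed = True
        # T2: merge a non-entry block into its unique predecessor.
        for n in list(succs):
            if n == entry:
                continue
            preds = [p for p in succs if n in succs[p]]
            if len(preds) == 1:
                p = preds[0]
                succs[p].discard(n)
                succs[p] |= succs.pop(n)
                changed = True
                break  # restart so T1 sees any new self-loop
    return len(succs) == 1


# A natural loop is reducible; two blocks that jump into each other and
# are both reachable directly from the entry are not.
loop = {"entry": ["head"], "head": ["body", "exit"],
        "body": ["head"], "exit": []}
cross = {"entry": ["a", "b"], "a": ["b"], "b": ["a"]}
print(is_reducible(loop, "entry"), is_reducible(cross, "entry"))
```

In the "cross" graph neither "a" nor "b" has a single predecessor, so no
merge applies and the graph stays at three nodes. That is the case where a
target would have to pick between duplicating blocks, introducing a
control flow variable, or leaving the branch alone.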
>
>> * Get the missing arithmetic and data conversion instructions
>> working.
>> (approx. August 19 - August 26)
>>
>> Most of the floating point, integer and vector operations required
>> by the OpenCL spec will be functional by the end of this period.
>>
>> * Work on the standard library and intrinsics.
>> (approx. August 26 - September 16)
>>
>> This will involve getting a reasonable subset of the OpenCL
>> standard library working, including math functions, thread
>> synchronisation functions, atomic functions, memory barriers and
>> surface sampling/write-back functions.
>
> I'm assuming you are planning to use libclc (http://libclc.llvm.org) for
> this.
>
Yes.
> While implementing standard library builtins is important, I think this
> task may be a little bit outside the scope of this project. I would
> recommend dropping this from the schedule and adding it as a task to
> work on if you finish everything else early. This way you can give
> yourself more time to work on the actual backend.
OK, I'll make this one optional.
>>
>> * Documentation and remaining clean-up work.
>> (approx. September 16 - September 23)
>>
>
> I think your proposal should also include a plan for getting the backend
> into mainline LLVM, because this is really the ultimate goal of the
> project. Your plan should include where the code repository will be
> stored and how you will engage with the community to help you review
> the code. I think this is really important not just for you, but also
> for the LLVM community to know what they need to do as far as helping
> get the backend into the main tree.
>
I'm a little lost on this point... My plan is just to keep working on
it until it's good enough to be considered suitable for mainline,
meanwhile it could live in a separate repository in freedesktop or
github. Not sure what else would be expected from me -- of course, I'm
willing to keep fixing bugs, API breakages and reviewing related patches
once it's merged to mainline.
>
>> By the end of each period all the relevant OpenCL language tests from
>> the piglit suite [3] and opencl-example [4] will be expected to pass.
>> New tests will be written for implemented features that don't have
>> sufficient coverage from the existing test suites.
>>
>
> I know you'll be using the nouveau drivers to test this backend on real
> hardware, and I think that's OK, but I do think you need to be careful
> about not spending too much time fixing bugs in the nouveau driver. I
> think piglit passes is a good goal, but I would also like to see OpenCL
> or LLVM IR based lit tests added as a goal, because TGSI code gen
> is the main focus of this project.
>
Yes, good point, I agree that for now it would make more sense to focus
on having extensive coverage in the form of lit tests.
> Thank you for submitting an early draft of your proposal, I think it is
> really good to get developer feedback early. I would encourage you to
> continue to submit drafts up until the deadline to maximize the input
> you get from LLVM developers.
>
OK, I will do that. Thank you for taking the time to read and comment
on my proposal.
> -Tom
>