thr3ads.net - llvm dev - [llvm-dev] [RFC] design doc for straight-line scalar optimizations [Aug 2015]


Jingyue Wu via llvm-dev

2015-Aug-24 18:10 UTC

[llvm-dev] [RFC] design doc for straight-line scalar optimizations

Hi,

As you may have noticed, since last year, we (Google's CUDA compiler team)
have contributed quite a lot to the effort of optimizing LLVM for CUDA
programs. I think it's worthwhile to write some docs to wrap them up for
two reasons.
1) Whoever wants to understand or work on these optimizations has some
detailed docs instead of just source code to refer to.
2) RFC on how to improve these optimizations so that other targets can
benefit from them as well. They are currently mostly restricted to the
NVPTX backend, but I see many potentials to generalize them.

So, I started from this overdue design doc
<https://docs.google.com/document/d/1momWzKFf4D6h8H3YlfgKQ3qeZy5ayvMRh6yR-Xn2hUE/edit?usp=sharing>
on the straight-line scalar optimizations. I will send out more docs on other
optimizations later. Please feel free to comment.

Thanks,
Jingyue

escha via llvm-dev

2015-Aug-25 01:43 UTC


> On Aug 24, 2015, at 11:10 AM, Jingyue Wu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
> As you may have noticed, since last year, we (Google's CUDA compiler
> team) have contributed quite a lot to the effort of optimizing LLVM for CUDA
> programs. I think it's worthwhile to write some docs to wrap them up for two
> reasons.
> 1) Whoever wants to understand or work on these optimizations has some
> detailed docs instead of just source code to refer to.
> 2) RFC on how to improve these optimizations so that other targets can
> benefit from them as well. They are currently mostly restricted to the NVPTX
> backend, but I see many potentials to generalize them.
>
> So, I started from this overdue design doc
> <https://docs.google.com/document/d/1momWzKFf4D6h8H3YlfgKQ3qeZy5ayvMRh6yR-Xn2hUE/edit?usp=sharing>
> on the straight-line scalar optimizations. I will send out more docs on other
> optimizations later. Please feel free to comment.
>
> Thanks,
> Jingyue

Out of curiosity, is there any plan to make the NVPTX-originated passes
(separateconstantoffsetfromgep, slsr, naryreassociate) more generic? They seem
very specialized for the nVidia GPU addressing modes despite the generic names,
and in my tests tend to pessimize our target more often than not for that
reason.

It’d be really nice to have something more generic, and I might look into
helping with that sort of thing in the future if it becomes important for us.

—escha

Jingyue Wu via llvm-dev

2015-Aug-25 01:52 UTC


Hi Escha,

We certainly would love to generalize them as long as the performance
doesn't suffer in general. If you have specific use cases that are
regressed due to these optimizations, I am more than happy to take a look.

On Mon, Aug 24, 2015 at 6:43 PM, escha <escha at apple.com> wrote:
>
> On Aug 24, 2015, at 11:10 AM, Jingyue Wu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> > Hi,
> >
> > As you may have noticed, since last year, we (Google's CUDA compiler team)
> > have contributed quite a lot to the effort of optimizing LLVM for CUDA
> > programs. I think it's worthwhile to write some docs to wrap them up for
> > two reasons.
> > 1) Whoever wants to understand or work on these optimizations has some
> > detailed docs instead of just source code to refer to.
> > 2) RFC on how to improve these optimizations so that other targets can
> > benefit from them as well. They are currently mostly restricted to the
> > NVPTX backend, but I see many potentials to generalize them.
> >
> > So, I started from this overdue design doc
> > <https://docs.google.com/document/d/1momWzKFf4D6h8H3YlfgKQ3qeZy5ayvMRh6yR-Xn2hUE/edit?usp=sharing>
> > on the straight-line scalar optimizations. I will send out more docs on
> > other optimizations later. Please feel free to comment.
> >
> > Thanks,
> > Jingyue
>
> Out of curiosity, is there any plan to make the NVPTX-originated passes
> (separateconstantoffsetfromgep, slsr, naryreassociate) more generic? They
> seem very specialized for the nVidia GPU addressing modes despite the
> generic names, and in my tests tend to pessimize our target more often than
> not for that reason.
>
> It’d be really nice to have something more generic, and I might look into
> helping with that sort of thing in the future if it becomes important for
> us.
>
> —escha

Eli Bendersky via llvm-dev

2015-Aug-25 15:52 UTC


On Mon, Aug 24, 2015 at 6:43 PM, escha via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Aug 24, 2015, at 11:10 AM, Jingyue Wu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> > Hi,
> >
> > As you may have noticed, since last year, we (Google's CUDA compiler team)
> > have contributed quite a lot to the effort of optimizing LLVM for CUDA
> > programs. I think it's worthwhile to write some docs to wrap them up for
> > two reasons.
> > 1) Whoever wants to understand or work on these optimizations has some
> > detailed docs instead of just source code to refer to.
> > 2) RFC on how to improve these optimizations so that other targets can
> > benefit from them as well. They are currently mostly restricted to the
> > NVPTX backend, but I see many potentials to generalize them.
> >
> > So, I started from this overdue design doc
> > <https://docs.google.com/document/d/1momWzKFf4D6h8H3YlfgKQ3qeZy5ayvMRh6yR-Xn2hUE/edit?usp=sharing>
> > on the straight-line scalar optimizations. I will send out more docs on
> > other optimizations later. Please feel free to comment.
> >
> > Thanks,
> > Jingyue
>
> Out of curiosity, is there any plan to make the NVPTX-originated passes
> (separateconstantoffsetfromgep, slsr, naryreassociate) more generic? They
> seem very specialized for the nVidia GPU addressing modes despite the
> generic names, and in my tests tend to pessimize our target more often than
> not for that reason.
>
> It’d be really nice to have something more generic, and I might look into
> helping with that sort of thing in the future if it becomes important for
> us.
>
> —escha

To add to Jingyue's answer - the reason these passes are not more generic
is very pragmatic - we've just optimized them for the NVIDIA targets we
care about and can run extensive benchmarking on. There's absolutely no
problem generalizing them if someone's interested - in fact, we'd be happy
to see that happen. This is what open-source is for :) IIRC some of the
optimization work was already generalized by the AMD backend folks, and
more can be done for sure. While PTX has its specific characteristics, many
of the general issues with GPU-oriented optimizations are common to other
GPU architectures and can be generalized in IR level passes.

Eli

Philip Reames via llvm-dev

2015-Aug-25 20:35 UTC


Thank you for sharing this.  It made for extremely interesting reading.

Philip

On 08/24/2015 11:10 AM, Jingyue Wu via llvm-dev wrote:
> Hi,
>
> As you may have noticed, since last year, we (Google's CUDA compiler
> team) have contributed quite a lot to the effort of optimizing LLVM
> for CUDA programs. I think it's worthwhile to write some docs to wrap
> them up for two reasons.
> 1) Whoever wants to understand or work on these optimizations has some
> detailed docs instead of just source code to refer to.
> 2) RFC on how to improve these optimizations so that other targets can
> benefit from them as well. They are currently mostly restricted to the
> NVPTX backend, but I see many potentials to generalize them.
>
> So, I started from this overdue design doc
> <https://docs.google.com/document/d/1momWzKFf4D6h8H3YlfgKQ3qeZy5ayvMRh6yR-Xn2hUE/edit?usp=sharing>
> on the straight-line scalar optimizations. I will send out more docs on
> other optimizations later. Please feel free to comment.
>
> Thanks,
> Jingyue
