thr3ads.net - llvm dev - [llvm-dev] Clang option to provide list of target-subarchs. [Feb 2017]

Justin Lebar via llvm-dev

2017-Feb-07 22:07 UTC

[llvm-dev] Clang option to provide list of target-subarchs.

In principle this sounds fine to me.  Some questions/comments:

* You can specify --cuda-gpu-arch multiple times, and each time adds a
new CUDA arch.

How is this going to work with --target-subarchs?  Is there going to
be a --no-target-subarchs flag to disable subarchs?  What will the
semantics of this be, exactly?

The semantics of flags that mean "compile for this one subarch" and
"don't compile for this one subarch" seem a lot more straightforward
than flags that deal in lists.  What are your thoughts about making it
work that way instead?

* I don't think "target subarch" is a great name.  I don't think most
people think of GPU architectures as "sub-architectures"; "subarch"
implies to me that there are different GPU archs for each CPU arch,
which obviously isn't the case.

Similarly, what problem are we solving by putting "target" in the flag
name?  We already have e.g. -march; it's not -mtarget-arch.

"--offload-arch", maybe?

* As discussed offline, we'll need to continue supporting the existing
flags, e.g. --cuda-gpu-arch, probably forever.

* There are a bunch of other flags we may want to harmonize.  For
example, I believe CUDA and OpenCL both have separate
flush-denormals-to-zero flags.

On Tue, Feb 7, 2017 at 1:39 PM, Eric Christopher <echristo at gmail.com> wrote:
> Adding Justin as well.
>
> Overall this seems reasonable to me depending on the actual patch :)
>
> -eric
>
>
> On Mon, Feb 6, 2017 at 7:23 PM Rodgers, Gregory via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> There are at least four clang frontends for offloading to accelerators:
>> 1 Cuda clang  2 OpenMP  3 HCC and 4 OpenCL.    These frontends will
>> want to embed object code for multiple offload targets into a single
>> application binary to provide portability across different
>> subarchitectures (e.g. sm_35, sm_50) and across different architectures
>> (e.g. nvptx64, amdgcn).
>>
>> Problem:  Different frontends are using different flags to provide a
>> list of subarchitectures.  For example, cuda clang repeats the flag
>> “--cuda-gpu-arch=sm_35  --cuda-gpu-arch=sm_50” and HCC uses
>> “--amdgpu-target=gfx701 --amdgpu-target=gfx802”.
>>
>> We propose a common clang flag to provide a list of target
>> subarchitectures called “target-subarchs”. For example,
>>
>> --target-subarchs=sm_35,sm_50,gfx701,gfx802
>>
>> In discussions with HCC and OpenMP maintainers,  we believe a new
>> flag name would have these requirements:
>>    end in “s” because it is a list;
>>    not have vendor specific names like cuda and amd;
>>    not contain “gpu” because offloading may extend to non-gpu archs;
>>    avoid “arch” by itself so as not to be confused with first field of a
>> triple;
>>    and not collide with existing flags to allow both options.
>>
>> "--target-subarchs" satisfies all the above.  Comments?
>>
>> Greg Rodgers
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Rodgers, Gregory via llvm-dev

2017-Feb-07 23:18 UTC

[llvm-dev] Clang option to provide list of target-subarchs.

Thank you for the feedback.

> How is this going to work with --target-subarchs?  Is there going to be a
> --no-target-subarchs flag to disable subarchs?  What will the semantics of
> this be, exactly?

The large number of subarchs expected makes an inclusive only flag desirable and
an exclusive flag impractical.  Also, since subarchs will age more quickly than
archs, who knows what old crufty subarchs you would get with an exclusion flag.
We expect that the runtime will match the most appropriate subarch.
  
As is currently done with --cuda-gpu-arch, we expect that the triple for the
arch will be implied from the context.  That is, if one specifies
--target-subarchs="sm_50,gfx702", the software will generate the
triples "nvptx64-nvidia-cuda" and "amdgcn--cuda" from the
subarchs.  Collisions (different archs for the same subarch) are unlikely and
would indicate a poor choice of subarch names.  For example, AMD should never
choose an sm_ prefix for its subarchs.
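
To make the "triple implied from the subarch" idea concrete, here is a minimal Python sketch. It is only an illustration and not clang driver code: the prefix table and the helper names (PREFIX_TO_TRIPLE, triple_for_subarch, group_by_triple) are assumptions made up for this example, and the only triples used are the two quoted above.

```python
# Illustrative only: this table and these helper names are assumptions,
# not part of clang. The two triples are the ones quoted in the message.
PREFIX_TO_TRIPLE = {
    "sm_": "nvptx64-nvidia-cuda",
    "gfx": "amdgcn--cuda",
}


def triple_for_subarch(subarch):
    """Return the triple implied by a subarch name, based on its prefix."""
    matches = [triple for prefix, triple in PREFIX_TO_TRIPLE.items()
               if subarch.startswith(prefix)]
    if len(matches) != 1:
        # Zero matches: unknown subarch. More than one: a naming collision.
        raise ValueError(f"cannot infer a unique triple for {subarch!r}")
    return matches[0]


def group_by_triple(flag_value):
    """Split a comma-separated subarch list and group it by implied triple."""
    grouped = {}
    for subarch in flag_value.split(","):
        subarch = subarch.strip()
        if subarch:
            grouped.setdefault(triple_for_subarch(subarch), []).append(subarch)
    return grouped


print(group_by_triple("sm_50,gfx702"))
```

A name that matches no prefix, or more than one, is rejected rather than guessed at, which is the "poor choice of subarch names" case.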

> ... than flags that deal in lists.  What are your thoughts about making it
> work that way instead?

A list, versus repeating a flag for each desired object, does ease typing,
which may not be justification enough.  But when repeated flags get lost and
separated in long option lists, it could be frustrating.  Using a list
improves readability of scripts.  As we said, existing flags would still be
supported.

> what problem are we solving by putting "target" in the flag name?
> We already have e.g. -march; it's not -mtarget-arch.
> "--offload-arch", maybe?

There are no problems solved with the word "target".  The genesis for
me of this name is the association with OpenMP target pragmas used for
offloading.  "Target" is a noun and "offload" is a verb.  We want a name for
a list of objects, ending in "s".  I am ok with archs instead of subarchs
because it continues to imply some relationship with the arch field of the
triple.

I am ok with "--offload-archs".  If anyone has an issue with
--offload-archs, please raise it here.

Thank you 

Greg

Justin Lebar via llvm-dev

2017-Feb-07 23:23 UTC

[llvm-dev] Clang option to provide list of target-subarchs.

> The large number of subarchs expected makes an inclusive only flag
> desirable and an exclusive flag impractical.

Sorry, I think I wasn't clear about what I meant.

I did not mean that --no-target-subarchs=X would enable all known
subarchs other than X.  That would, as you say, cause problems due to
the large number of subarchs we support.  Instead, I meant that
--target-subarchs=X,Y --no-target-subarchs=Y,Z would build only
subarch X.

This is similar to many other flags in clang.  See for example -Wfoo
and -Wno-foo.  These can override each other; the last one wins.

This is important so that scripts can refine the compile flags
provided by other scripts.  We currently do this inside of Google with
--cuda-gpu-arch -- it is not a contrived use-case.

For basically the same reasons that we allow -Wfoo and -Wno-foo to
appear in the same command line invocation, I think we should allow
--offload-archs=X and --no-offload-archs=X to appear in the same
invocation.
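
As a rough illustration of these semantics, here is a small Python sketch that walks the arguments left to right, so a later flag overrides an earlier one. It is only a sketch under assumptions: the flag spellings are the ones proposed in this thread, not flags clang is known to accept, and resolve_offload_archs is a hypothetical helper, not driver code.

```python
# Sketch of left-to-right "last one wins" resolution. The flag names are the
# spellings proposed in this thread; the helper names are hypothetical.
ENABLE = "--offload-archs="
DISABLE = "--no-offload-archs="


def _names(value):
    """Split a comma-separated list, ignoring empty entries."""
    return [name.strip() for name in value.split(",") if name.strip()]


def resolve_offload_archs(argv):
    """Return the archs still enabled after processing argv in order."""
    enabled = []  # a list rather than a set, to keep first-seen order
    for arg in argv:
        if arg.startswith(ENABLE):
            for name in _names(arg[len(ENABLE):]):
                if name not in enabled:
                    enabled.append(name)
        elif arg.startswith(DISABLE):
            dropped = set(_names(arg[len(DISABLE):]))
            # Disabling an arch that was never enabled is a no-op.
            enabled = [name for name in enabled if name not in dropped]
    return enabled


print(resolve_offload_archs(["--offload-archs=X,Y", "--no-offload-archs=Y,Z"]))
```

With the arguments shown, only X remains. Reversing the order, as in --no-offload-archs=X followed by --offload-archs=X, leaves X enabled, which is what lets one script refine the flags set by another.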

On Tue, Feb 7, 2017 at 3:18 PM, Rodgers, Gregory
<Gregory.Rodgers at amd.com> wrote:
> Thank you for the feedback.
>
>> How is this going to work with --target-subarchs?  Is there going to be
>> a --no-target-subarchs flag to disable subarchs?  What will the semantics
>> of this be, exactly?
>
> The large number of subarchs expected makes an inclusive only flag
> desirable and an exclusive flag impractical.  Also, since subarchs will age
> more quickly than archs, who knows what old crufty subarchs you would get
> with an exclusion flag.  We expect that the runtime will match the most
> appropriate subarch.
>
> As is currently done with --cuda-gpu-arch, we expect that the triple for
> the arch will be implied from the context.  That is, if one specifies
> --target-subarchs="sm_50,gfx702", the software will generate the
> triples "nvptx64-nvidia-cuda" and "amdgcn--cuda" from the
> subarchs.  Collisions (different archs for the same subarch) are unlikely
> and would indicate a poor choice of subarch names.  For example, AMD should
> never choose an sm_ prefix for its subarchs.
>
>> ... than flags that deal in lists.  What are your thoughts about making
>> it work that way instead?
>
> A list, versus repeating a flag for each desired object, does ease typing,
> which may not be justification enough.  But when repeated flags get lost
> and separated in long option lists, it could be frustrating.  Using a list
> improves readability of scripts.  As we said, existing flags would still
> be supported.
>
>> what problem are we solving by putting "target" in the flag
>> name?  We already have e.g. -march; it's not -mtarget-arch.
>> "--offload-arch", maybe?
>
> There are no problems solved with the word "target".  The genesis
> for me of this name is the association with OpenMP target pragmas used for
> offloading.  "Target" is a noun and "offload" is a verb.  We want a name
> for a list of objects, ending in "s".  I am ok with archs instead of
> subarchs because it continues to imply some relationship with the arch
> field of the triple.
>
> I am ok with "--offload-archs".  If anyone has an issue with
> --offload-archs, please raise it here.
>
> Thank you
>
> Greg
