thr3ads.net - llvm dev - [llvm-dev] [RFC] Refactor class hierarchy of VectorType in the IR [May 2020]

Chris Tetreault via llvm-dev

2020-May-22 07:15 UTC

[llvm-dev] [RFC] Refactor class hierarchy of VectorType in the IR

John,

   For the last several months, those of us working on the scalable vectors
feature have been examining the codebase, identifying places where
llvm::VectorType is used incorrectly, and fixing them. The fact is that there
are many places where VectorType is correctly taken to be the generic “any
vector” type. getNumElements may be being called, but it’s being called in
accordance with the previously documented semantics. There are many places where
it isn’t as well, and many people add new usages that are incorrect.

   This puts us in an unfortunate situation: if we were to take your proposal
and have VectorType be the fixed width vector type, then all of this work is
undone. Every place that has been fixed up to correctly have VectorType be used
as a universal vector type will now incorrectly have the fixed width vector type
being used as the universal vector type. Since VectorType will inherit from
BaseVectorType, it will have inherited the getElementCount(), so the compiler
will happily continue to compile this code. However, none of this code will even
work with scalable vectors because the bool will always be false. There will be
no compile time indication that this is going on, functions will just start
mysteriously returning nullptr. Earlier this afternoon, I set about seeing how
much work it would be to change the type names as you have suggested. I do not
see any way forward other than painstakingly auditing the code.

   On the other hand, creating a new fixed width vector type has a clear upgrade
path:

  1.  Delete getNumElements() from the base class locally.
  2.  Try to build.
  3.  Fix the failures, uploading patches for these fixes.
  4.  Once step 3 is completed throughout the codebase, merge the patch to
remove getNumElements() from VectorType.

Downstream and out-of-tree codebases have this upgrade path as well. There
exists no easy upgrade path if we go the other way and have VectorType become a
specifically fixed-width vector type.
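
As a rough illustration of that upgrade path, here is a minimal sketch using hypothetical stand-in classes (they are not the real LLVM definitions, and the helper fixedLaneCountOrZero is invented for the example). With getNumElements() absent from the base class, a call through a plain VectorType pointer stops compiling, and the call site has to state that it wants a fixed-width vector:

```cpp
#include <cassert>

// Hypothetical stand-ins for the IR classes; not the real LLVM definitions.
struct ElementCount {
  unsigned Min;  // exact count (fixed) or minimum count (scalable)
  bool Scalable; // true for scalable vectors
};

class VectorType {
public:
  enum Kind { FixedKind, ScalableKind };
  Kind getKind() const { return TheKind; }
  ElementCount getElementCount() const {
    return {MinElts, TheKind == ScalableKind};
  }
  // Step 1: getNumElements() is deliberately absent here, so
  // `Ty->getNumElements()` on a plain VectorType* fails to compile.

protected:
  VectorType(Kind K, unsigned N) : TheKind(K), MinElts(N) {}

private:
  Kind TheKind;
  unsigned MinElts;
};

class FixedVectorType : public VectorType {
public:
  explicit FixedVectorType(unsigned N) : VectorType(FixedKind, N) {}
  unsigned getNumElements() const { return getElementCount().Min; }
  static bool classof(const VectorType *T) {
    return T->getKind() == FixedKind;
  }
};

class ScalableVectorType : public VectorType {
public:
  explicit ScalableVectorType(unsigned MinN) : VectorType(ScalableKind, MinN) {}
  static bool classof(const VectorType *T) {
    return T->getKind() == ScalableKind;
  }
};

// Step 3: a call site that needs a fixed count says so explicitly.
// Returns 0 for scalable vectors instead of silently using the minimum.
unsigned fixedLaneCountOrZero(const VectorType *Ty) {
  if (FixedVectorType::classof(Ty))
    return static_cast<const FixedVectorType *>(Ty)->getNumElements();
  return 0;
}

// Small usage check for the sketch above.
void demoUpgradePath() {
  FixedVectorType F(4);
  ScalableVectorType S(4);
  assert(fixedLaneCountOrZero(&F) == 4);
  assert(fixedLaneCountOrZero(&S) == 0);
}
```

The design point is that the failure moves from run time to build time: every remaining use of getNumElements() on the generic type shows up as a compile error that can be fixed one patch at a time.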

   Basically, I believe that the best thing to do is to move forward with the
type names that I have proposed:


  1.  The type names more accurately represent the usage of the types.
  2.  Changing course now would result in a tremendous amount of rework being
done in the upstream codebase. This will have a significant impact on the pace
of development in upstream.
  3.  The process for completing the changes would be much easier if we use the
types I propose. The compiler can tell you if you’re using getNumElements() on a
potentially scalable vector. The compiler cannot tell you that
SomeFixedVector->getElementCount().Scalable and
isa<ScalableVectorType>(SomeFixedVector) are always false.

   Additionally, for those who disagree that the LLVM developer policy is to
disregard the needs of downstream codebases when making changes to upstream, I
submit that not throwing away months of work by everybody working to fix the
codebase to handle scalable vectors represents a real expected benefit. I
personally have been spending most of my time on this since January.

Thanks,
   Christopher Tetreault

From: John McCall <rjmccall at apple.com>
Sent: Thursday, May 21, 2020 3:32 PM
To: Chris Tetreault <ctetreau at quicinc.com>
Cc: llvm-dev at lists.llvm.org
Subject: [EXT] Re: [llvm-dev] [RFC] Refactor class hierarchy of VectorType in
the IR


On 21 May 2020, at 17:22, Chris Tetreault wrote:

John,

This is not categorically true, no. When we make changes that require
large-scale updates for downstream codebases, we do so because there’s a real
expected benefit to it. For the most part, we do make some effort to keep
existing source interfaces stable.

While I’m at a loss to find a documented policy, I recall this thread
(http://lists.llvm.org/pipermail/llvm-dev/2020-February/139207.html) where this
claim was made and not disputed. The expected real benefits to this change are:
1) The names now match the semantics 2) It is now statically impossible to
accidentally get the fixed number of elements from a scalable vector and 3) It
forces everybody to fix their broken code. If we provided stability guarantees
to downstream and out-of-tree codebases, then I might not agree that 3 is a
benefit, but my understanding is that we do not provide this guarantee.

… Probably 99% of the code using VectorType semantically requires it to be a
fixed-width vector.

This code is all broken already. I don’t think supporting common misuse of APIs
in a codebase that does not provide stability guarantees is something we should
be doing.

The generalization of VectorType to scalable vector types was a representational
shortcut that should never have been allowed

I agree. Unfortunately it happened, and our choices are to fix it or accept the
technical debt forever.

Perhaps this is part of the root of our disagreement. I consider scalable vector
types to be an experimental/unstable feature, as many features are when they’re
first added to the compiler. I have much lower standards for disrupting the
early adopters of features like that; if scalable vectors need to be split out
of VectorType, we should just do it.

Analogously, when we upstream the pointer-authentication feature, we will be
upstreaming a rather flawed representation that I definitely expect us to revise
after the fact. That will be problematic for early adopters of LLVM’s pointer
authentication support, but that’s totally acceptable.

John.

the burden of generalization should usually fall on the people who need to use
the generalization, and otherwise we should aim to keep APIs stable when there’s
no harm to it.

The burden of creating the generalization should be placed on those who need it,
I agree. However, once the generalization is in place, the burden is on
everybody to use it correctly. The reason I’ve undertaken this refactor is
because I found myself playing whack-a-mole with people adding new broken code
after the fact. The previous API was very easy to misuse, so I don’t blame those
people.

Are you actually auditing and testing them all to work for scalable vector
types, or are you just fixing the obvious compile failures?

Everywhere that VectorType::getNumElements() is being called, we are either
changing the code to cast to FixedVectorType, or we are updating the code to
handle scalable vectors correctly. If the call site does not have test coverage
with scalable vectors, this test coverage is being added. Even for obviously
correct transformations such as `VectorType::get(SomeTy,
SomeVecTy->getNumElements())` -> `VectorType::get(SomeTy,
SomeVecTy->getElementCount())`, I have been required in code review to
provide test coverage. We are taking this seriously.
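
To make the difference between those two call patterns concrete, here is a small sketch with a hypothetical, heavily simplified vector descriptor (VecTy, ElemKind, retypeOld and retypeNew are invented names, not LLVM API). Rebuilding a type from getNumElements() drops the scalable flag, while rebuilding from getElementCount() carries it along:

```cpp
#include <cassert>

// Hypothetical, simplified model; not the real LLVM classes.
struct ElementCount {
  unsigned Min;
  bool Scalable;
};

enum class ElemKind { I32, Float };

struct VecTy {
  ElemKind Elem;
  ElementCount EC;

  unsigned getNumElements() const { return EC.Min; }
  ElementCount getElementCount() const { return EC; }

  // Fixed-width overload: the count is taken to be exact.
  static VecTy get(ElemKind E, unsigned N) { return {E, {N, false}}; }
  // General overload: keeps whatever the ElementCount says.
  static VecTy get(ElemKind E, ElementCount Count) { return {E, Count}; }
};

// Old pattern: a scalable input silently becomes a fixed-width result.
VecTy retypeOld(const VecTy &V, ElemKind E) {
  return VecTy::get(E, V.getNumElements());
}

// Updated pattern: the scalable flag is preserved.
VecTy retypeNew(const VecTy &V, ElemKind E) {
  return VecTy::get(E, V.getElementCount());
}

// Small usage check for the sketch above.
void demoRetype() {
  VecTy S = VecTy::get(ElemKind::I32, ElementCount{4, true});
  assert(!retypeOld(S, ElemKind::Float).EC.Scalable); // flag lost
  assert(retypeNew(S, ElemKind::Float).EC.Scalable);  // flag kept
}
```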

“Vector” has a traditional and dominant meaning as a fixed-width SIMD type, and
the fact that you’ve introduced a generalization doesn’t change that. Clang
supports multiple kinds of pointer, but we still reserve clang::PointerType for
C pointers instead of making it an abstract superclass, thus letting our sense
of logic introduce a million bugs through accidental generalization throughout
the compiler.

Various languages implemented on top of LLVM have various different pointer
types. However, the LLVM IR language only has one pointer type. The LLVM IR
language has two types of vectors, and I think it’s reasonable to model them as
a class hierarchy in this manner. I’m not familiar with the clang::PointerType
situation, so I cannot pass judgement on it.

Thanks,
Christopher Tetreault

From: John McCall <rjmccall at apple.com>
Sent: Thursday, May 21, 2020 1:47 PM
To: Chris Tetreault <ctetreau at quicinc.com>
Cc: llvm-dev at lists.llvm.org
Subject: [EXT] Re: [llvm-dev] [RFC] Refactor class hierarchy of VectorType in
the IR


On 21 May 2020, at 16:01, Chris Tetreault wrote:

Hi John,

I’d like to address some points in your message.

Practically speaking, this is going to break every out-of-tree frontend,
backend, or optimization pass that supports SIMD types.

My understanding is that the policy in LLVM development is that we do not let
considerations for downstream and out-of-tree codebases affect the pace of
development.

This is not categorically true, no. When we make changes that require
large-scale updates for downstream codebases, we do so because there’s a real
expected benefit to it. For the most part, we do make some effort to keep
existing source interfaces stable.

The C++ API is explicitly unstable. I maintain a downstream fork of LLVM myself,
so I know the pain that this is causing because I get to fix all the issues in
my internal codebase. However, the fix for downstream codebases is very simple:
Just find all the places where it says VectorType, and change it to say
FixedVectorType.

… by having the VectorType type semantically repurposed out from under them.

The documented semantics of VectorType prior to my RFC were that it is a
generalization of all vector types. The VectorType contains an ElementCount,
which is a pair of (bool, unsigned). If the bool is true, then the return value
of getNumElements() is the minimum number of vector elements. If the bool is
false, then it is the actual number of elements. My RFC has not changed these
semantics. It will eventually delete a function that has been pervasively
misused throughout the codebase, but the semantics remain the same. You are
proposing a semantic change to VectorType to have it specifically be a fixed
width vector.
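
The (bool, unsigned) semantics described above can be sketched as follows. This is an illustrative model only: the field names, the helper functions, and the VScale parameter (standing in for the hardware-dependent runtime multiplier of a scalable vector) are assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for the (bool, unsigned) pair; field names are
// illustrative and not copied from LLVM.
struct ElementCount {
  bool Scalable; // true: Min is only a minimum, scaled at run time
  unsigned Min;  // false: Min is the actual number of elements
};

// True only when Min is the actual number of elements.
bool isExactCount(ElementCount EC) { return !EC.Scalable; }

// Element count for a given runtime multiplier. VScale is an assumed
// parameter modelling the hardware-dependent scaling factor.
unsigned numElementsAt(ElementCount EC, unsigned VScale) {
  assert(VScale >= 1 && "scaling factor must be positive");
  return EC.Scalable ? EC.Min * VScale : EC.Min;
}
```

For a fixed count of 4 the result is 4 regardless of the multiplier, while a scalable count with a minimum of 4 yields 8 elements at a multiplier of 2. Code that reads only the unsigned half of the pair sees 4 in both cases, which is the misuse being discussed.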

Probably 99% of the code using VectorType semantically requires it to be a
fixed-width vector. The generalization of VectorType to scalable vector types
was a representational shortcut that should never have been allowed; it should
always have used a different type. Honestly, I’m not convinced your abstract
base type is even going to be very useful in practice vs. just having a
getVectorElementType() accessor that checks for both and otherwise returns null.

… a particular largely-vendor-specific extension …

All SIMD vectors are vendor specific extensions. Just because most of the most
popular architectures have them does not make this not true. AArch64 and RISC-V
have scalable vectors, so it is not just one architecture. It is the
responsibility of all developers to ensure that they use the types correctly. It
would be nice if the obvious thing to do is the correct thing to do.

I’m not saying that we shouldn’t support scalable vector types. I’m saying that
the burden of generalization should usually fall on the people who need to use
the generalization, and otherwise we should aim to keep APIs stable when there’s
no harm to it.

… it’s much better for code that does support both to explicitly opt in by
checking for and handling the more general type …

This is how it will work. I am in the process of fixing up call sites that make
fixed width assumptions so that they use FixedVectorType.

Are you actually auditing and testing them all to work for scalable vector
types, or are you just fixing the obvious compile failures? Because scalable
vector types impose some major restrictions that aren’t imposed on normal
vectors, and the static type system isn’t going to catch most of them.

I think that it is important to ensure that things have clear sensible names,
and to clean up historical baggage when the opportunity presents.

“Vector” has a traditional and dominant meaning as a fixed-width SIMD type, and
the fact that you’ve introduced a generalization doesn’t change that. Clang
supports multiple kinds of pointer, but we still reserve clang::PointerType for
C pointers instead of making it an abstract superclass, thus letting our sense
of logic introduce a million bugs through accidental generalization throughout
the compiler.

You have resigned yourself to doing a lot of work in pursuit of something that I
really don’t think is actually an improvement.

John.

John McCall via llvm-dev

2020-May-22 19:58 UTC

[llvm-dev] [RFC] Refactor class hierarchy of VectorType in the IR

On 22 May 2020, at 3:15, Chris Tetreault wrote:

> John,
>
>    For the last several months, those of us working on the scalable 
> vectors feature have been examining the codebase, identifying places 
> where llvm::VectorType is used incorrectly, and fixing them. The fact 
> is that there are many places where VectorType is correctly taken to 
> be the generic “any vector” type. getNumElements may be being 
> called, but it’s being called in accordance with the previously 
> documented semantics. There are many places where it isn’t as well, 
> and many people add new usages that are incorrect.
>
>    This puts us in an unfortunate situation: if we were to take your 
> proposal and have VectorType be the fixed width vector type, then all 
> of this work is undone. Every place that has been fixed up to 
> correctly have VectorType be used as a universal vector type will now 
> incorrectly have the fixed width vector type being used as the 
> universal vector type. Since VectorType will inherit from 
> BaseVectorType, it will have inherited the getElementCount(), so the 
> compiler will happily continue to compile this code. However, none of 
> this code will even work with scalable vectors because the bool will 
> always be false. There will be no compile time indication that this is 
> going on, functions will just start mysteriously returning nullptr. 
> Earlier this afternoon, I set about seeing how much work it would be 
> to change the type names as you have suggested. I do not see any way 
> forward other than painstakingly auditing the code.
If you define `getElementCount() = delete` in `VectorType`, you can 
easily find the places that are doing this and update them to use 
`VectorBaseType`.  You wouldn’t actually check that in, of course; 
it’s a tool for doing the audit in a way that’s no more painstaking 
than what you’re already doing with `getNumElements()`.  And in the 
meantime, the code that you haven’t audited — the code that’s 
currently unconditionally calling `getNumElements()` on a `VectorType` 
— will just conservatively not trigger on scalable vectors, which for 
most of LLVM is a better result than crashing if a scalable vector comes 
through until your audit gets around to updating it.
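
A minimal sketch of that audit trick, using hypothetical stand-in classes under the alternative naming (VectorBaseType as the generic type, VectorType as the fixed-width type). The detection trait and the helper names are invented for the example; the deleted declaration is meant as a local, audit-only change, as described above.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins; not the real LLVM definitions.
struct ElementCount {
  unsigned Min;
  bool Scalable;
};

// Generic "any vector" type under the alternative naming.
class VectorBaseType {
public:
  VectorBaseType(unsigned Min, bool Scalable) : EC{Min, Scalable} {}
  ElementCount getElementCount() const { return EC; }

private:
  ElementCount EC;
};

// Fixed-width vector type under the alternative naming.
class VectorType : public VectorBaseType {
public:
  explicit VectorType(unsigned N) : VectorBaseType(N, false) {}
  unsigned getNumElements() const {
    return VectorBaseType::getElementCount().Min;
  }
  // Audit-only: hides the inherited accessor, so any call of
  // getElementCount() through a VectorType is a compile error.
  ElementCount getElementCount() const = delete;
};

// Detects whether `T::getElementCount()` is callable; a deleted function
// makes the expression ill-formed, which selects the false case.
template <class T, class = void>
struct CanCallGetElementCount : std::false_type {};
template <class T>
struct CanCallGetElementCount<
    T, decltype(void(std::declval<const T &>().getElementCount()))>
    : std::true_type {};

static_assert(CanCallGetElementCount<VectorBaseType>::value,
              "generic type keeps the accessor");
static_assert(!CanCallGetElementCount<VectorType>::value,
              "fixed-width type rejects the accessor during the audit");

// Small usage check: audited code goes through the generic type.
void demoAudit() {
  VectorType V(4);
  assert(V.getNumElements() == 4);
  const VectorBaseType &Generic = V;
  assert(!Generic.getElementCount().Scalable);
}
```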
>    Additionally, for those who disagree that the LLVM developer policy 
> is to disregard the needs of downstream codebases when making changes 
> to upstream, I submit that not throwing away months of work by 
> everybody working to fix the codebase to handle scalable vectors 
> represents a real expected benefit. I personally have been spending 
> most of my time on this since January.
I’m responding to this as soon as I heard about it.  I’ll accept 
that ideally I would have seen it when you raised the RFC in March, 
although in practice it’s quite hard to proactively keep up with 
llvmdev, and as a community I think we really need to figure out a 
better process for IR design.  I’m not going to feel guilty about work 
you did for over a month without raising an RFC.  And I really don’t 
think you have in any way wasted your time; I am asking for a large but 
fairly mechanical change to the code you’ve already been updating.

But most of your arguments are not based on how much work you’ve done 
on your current audit, they’re based on the fact that scalable vectors 
were initially implemented as a flag on `VectorType`.  So part of my 
problem here is that you’re basically arguing that, as soon as that 
was accepted, the generalization of `VectorType` was irreversible; and 
that’s a real problem, because it’s very common for early prototype 
work to not worry much about representations, and so they stumble into 
this kind of problematic representation.

My concern is really only ~50% that this is going to force a lot of 
unnecessary mechanical changes for downstream projects and 50% that 
generalizing `VectorType` to include scalable vectors, as the initial 
prototype did, is the wrong polarity and makes a lot of existing code 
broken if it ever sees a scalable vector.  Your hierarchy change only 
solves this in the specific case that there’s an immediate call to 
`getNumElements()`.

Of course, if the community generally disagrees with me that this is 
necessary, I’ll accept that.  But right now I’m just hearing from 
people that are part of your project.

John.

Eli Friedman via llvm-dev

2020-May-22 22:34 UTC

[llvm-dev] [RFC] Refactor class hierarchy of VectorType in the IR

(reply inline)

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of John
McCall via llvm-dev
Sent: Friday, May 22, 2020 12:59 PM
To: Chris Tetreault <ctetreau at quicinc.com>
Cc: llvm-dev at lists.llvm.org
Subject: [EXT] Re: [llvm-dev] [RFC] Refactor class hierarchy of VectorType in
the IR

On 22 May 2020, at 3:15, Chris Tetreault wrote:

John,

For the last several months, those of us working on the scalable vectors feature
have been examining the codebase, identifying places where llvm::VectorType is
used incorrectly, and fixing them. The fact is that there are many places where
VectorType is correctly taken to be the generic “any vector” type.
getNumElements may be being called, but it’s being called in accordance with the
previously documented semantics. There are many places where it isn’t as well,
and many people add new usages that are incorrect.

This puts us in an unfortunate situation: if we were to take your proposal and
have VectorType be the fixed width vector type, then all of this work is undone.
Every place that has been fixed up to correctly have VectorType be used as a
universal vector type will now incorrectly have the fixed width vector type
being used as the universal vector type. Since VectorType will inherit from
BaseVectorType, it will have inherited the getElementCount(), so the compiler
will happily continue to compile this code. However, none of this code will even
work with scalable vectors because the bool will always be false. There will be
no compile time indication that this is going on, functions will just start
mysteriously returning nullptr. Earlier this afternoon, I set about seeing how
much work it would be to change the type names as you have suggested. I do not
see any way forward other than painstakingly auditing the code.

If you define getElementCount() = delete in VectorType, you can easily find the
places that are doing this and update them to use VectorBaseType. You wouldn’t
actually check that in, of course; it’s a tool for doing the audit in a way
that’s no more painstaking than what you’re already doing with getNumElements().
And in the meantime, the code that you haven’t audited — the code that’s
currently unconditionally calling getNumElements() on a VectorType — will just
conservatively not trigger on scalable vectors, which for most of LLVM is a
better result than crashing if a scalable vector comes through until your audit
gets around to updating it.

I think there are two separate aspects here, that we shouldn’t mix together:

  1.  How do we get to a consistent state in-tree, in llvm-project, where code
that requires fixed-length vectors only handles fixed-length vectors?
  2.  What names do we expose for out-of-tree users?

If we want to simply rename the types that currently exist in-tree
(VectorType/FixedVectorType/ScalableVectorType), we can do that mechanically in
a few patches; we can use a few “sed” invocations, then clang-format the result.
This would allow us to preserve the old meaning of the name “VectorType” for
out-of-tree code.  I don’t think this is particularly valuable; the names on
trunk seem fine, and out-of-tree code can equally use “sed” in the opposite
direction.

In terms of semantic changes, I see three alternatives:

  1.  We continue as we are: VectorType is the base class, and we plan to change
any code which expects a fixed-width vector to use FixedVectorType instead.
  2.  We “typedef VectorType BaseVectorType;”, go through and change all the
places where we expect a BaseVectorType, then change the meaning of VectorType
back to its original meaning of a fixed-width vector.  I think this is
problematic, though.  As this work is in progress, it would be hard to keep
track of whether code in the tree using the name VectorType means to use a
FixedVectorType, or a BaseVectorType. And the patch that actually changes the
meaning of VectorType would be a big functional change (even if it’s not
actually a big patch).
  3.  We completely kill off uses of the name “VectorType” in-tree:
incrementally rename every use of the name to either BaseVectorType or
FixedVectorType.

The last alternative is sort of more formal: it involves going through and
explicitly making a choice everywhere.  But I don’t think continuing as we are
is a problem.

Additionally, for those who disagree that the LLVM developer policy is to
disregard the needs of downstream codebases when making changes to upstream, I
submit that not throwing away months of work by everybody working to fix the
codebase to handle scalable vectors represents a real expected benefit. I
personally have been spending most of my time on this since January.

I’m responding to this as soon as I heard about it. I’ll accept that ideally I
would have seen it when you raised the RFC in March, although in practice it’s
quite hard to proactively keep up with llvmdev, and as a community I think we
really need to figure out a better process for IR design. I’m not going to feel
guilty about work you did for over a month without raising an RFC. And I really
don’t think you have in any way wasted your time; I am asking for a large but
fairly mechanical change to the code you’ve already been updating.

But most of your arguments are not based on how much work you’ve done on your
current audit, they’re based on the fact that scalable vectors were initially
implemented as a flag on VectorType. So part of my problem here is that you’re
basically arguing that, as soon as that was accepted, the generalization of
VectorType was irreversible; and that’s a real problem, because it’s very common
for early prototype work to not worry much about representations, and so they
stumble into this kind of problematic representation.

My concern is really only ~50% that this is going to force a lot of unnecessary
mechanical changes for downstream projects and 50% that generalizing VectorType
to include scalable vectors, as the initial prototype did, is the wrong polarity
and makes a lot of existing code broken if it ever sees a scalable vector. Your
hierarchy change only solves this in the specific case that there’s an immediate
call to getNumElements().

I had similar concerns about the “polarity” initially.  But we’ve found that, in
practice, that making the default “wrong” in IR optimizations has been helpful
for making progress on various aspects in parallel.  So we can take C code using
intrinsics, and produce assembly, even though we haven’t fixed all the issues in
the bits in between.

Practically speaking, almost all places that specifically need a fixed-length
type either call getNumElements(), or make some sort of query about the size of
the type.  So I don’t think there’s a big invisible tail of work even if we have
some code that’s temporarily wrong.

-Eli

Chris Tetreault via llvm-dev

2020-May-23 00:59 UTC

[llvm-dev] [RFC] Refactor class hierarchy of VectorType in the IR

John,
> I’m not going to feel guilty about work you did for over a month without
> raising an RFC.

I’m not sure what you mean by that, but I definitely did raise an RFC for
this. On March 9th, I raised the initial RFC. I then pinged it on April 22nd. It
was then pinged again on May 5th by another developer requesting a change to my
plan. Just last week it was requested that I reverse some changes I had made to
the C API for this work. I didn’t like those changes either, but I made them
because it wasn’t that much work and the impact wasn’t that great, and the
people requesting them felt very strongly about the value of it. However, this
change will be a ton of work and muddies up the C++ API in the name of
stability, when the C++ API has no expectation of remaining stable.

I agree with Sander that perhaps the bi-weekly SVE meeting would be a good forum
to discuss this further. I’d like to invite any other interested parties to join
us in the meeting as well. I’d really like to settle this issue once and for
all. My work can wait until then, it’s a long weekend after all. 😊

Thanks,
   Christopher Tetreault

From: John McCall <rjmccall at apple.com>
Sent: Friday, May 22, 2020 12:59 PM
To: Chris Tetreault <ctetreau at quicinc.com>
Cc: llvm-dev at lists.llvm.org
Subject: [EXT] Re: [llvm-dev] [RFC] Refactor class hierarchy of VectorType in
the IR


On 22 May 2020, at 3:15, Chris Tetreault wrote:

John,

For the last several months, those of us working on the scalable vectors feature
have been examining the codebase, identifying places where llvm::VectorType is
used incorrectly, and fixing them. The fact is that there are many places where
VectorType is correctly taken to be the generic “any vector” type.
getNumElements may be being called, but it’s being called in accordance with the
previously documented semantics. There are many places where it isn’t as well,
and many people add new usages that are incorrect.

This puts us in an unfortunate situation: if we were to take your proposal and
have VectorType be the fixed width vector type, then all of this work is undone.
Every place that has been fixed up to correctly have VectorType be used as a
universal vector type will now incorrectly have the fixed width vector type
being used as the universal vector type. Since VectorType will inherit from
BaseVectorType, it will have inherited the getElementCount(), so the compiler
will happily continue to compile this code. However, none of this code will even
work with scalable vectors because the bool will always be false. There will be
no compile time indication that this is going on, functions will just start
mysteriously returning nullptr. Earlier this afternoon, I set about seeing how
much work it would be to change the type names as you have suggested. I do not
see any way forward other than painstakingly auditing the code.

If you define getElementCount() = delete in VectorType, you can easily find the
places that are doing this and update them to use VectorBaseType. You wouldn’t
actually check that in, of course; it’s a tool for doing the audit in a way
that’s no more painstaking than what you’re already doing with getNumElements().
And in the meantime, the code that you haven’t audited — the code that’s
currently unconditionally calling getNumElements() on a VectorType — will just
conservatively not trigger on scalable vectors, which for most of LLVM is a
better result than crashing if a scalable vector comes through until your audit
gets around to updating it.

Additionally, for those who disagree that the LLVM developer policy is to
disregard the needs of downstream codebases when making changes to upstream, I
submit that not throwing away months of work by everybody working to fix the
codebase to handle scalable vectors represents a real expected benefit. I
personally have been spending most of my time on this since January.

I’m responding to this as soon as I heard about it. I’ll accept that ideally I
would have seen it when you raised the RFC in March, although in practice it’s
quite hard to proactively keep up with llvmdev, and as a community I think we
really need to figure out a better process for IR design. I’m not going to feel
guilty about work you did for over a month without raising an RFC. And I really
don’t think you have in any way wasted your time; I am asking for a large but
fairly mechanical change to the code you’ve already been updating.

But most of your arguments are not based on how much work you’ve done on your
current audit, they’re based on the fact that scalable vectors were initially
implemented as a flag on VectorType. So part of my problem here is that you’re
basically arguing that, as soon as that was accepted, the generalization of
VectorType was irreversible; and that’s a real problem, because it’s very common
for early prototype work to not worry much about representations, and so they
stumble into this kind of problematic representation.

My concern is really only ~50% that this is going to force a lot of unnecessary
mechanical changes for downstream projects and 50% that generalizing VectorType
to include scalable vectors, as the initial prototype did, is the wrong polarity
and makes a lot of existing code broken if it ever sees a scalable vector. Your
hierarchy change only solves this in the specific case that there’s an immediate
call to getNumElements().

Of course, if the community generally disagrees with me that this is necessary,
I’ll accept that. But right now I’m just hearing from people that are part of
your project.

John.
