thr3ads.net - llvm dev - [llvm-dev] RFC: changing variable naming rules in LLVM codebase [Feb 2019]

If this information is useful, please help other people find it:
Share via:

Chris Lattner via llvm-dev

2019-Feb-20 17:07 UTC

[llvm-dev] RFC: changing variable naming rules in LLVM codebase

> On Feb 19, 2019, at 7:43 AM, Alex Bradbury via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> On Tue, 19 Feb 2019 at 15:24, Zachary Turner <zturner at google.com>
wrote:
>> On Mon, Feb 18, 2019 at 2:16 AM Michael Platings via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>>> 
>>> Regarding a plan for conversion, I'm keen to avoid perfect
being the enemy of better.
>> 
>> 
>> It seems a bit early to discuss conversion given there’s not consensus
on a style.  For example:
> 
> I see it a bit differently. The first question is "should we change
> the LLVM naming conventions". I view the plan for conversion as
> essential to answering this question - IMHO if we're going to live
> with mixed styles for years (which would be the case if there were no
> concerted conversion effort) then the advantages of changing naming
> convention are outweighed by the disadvantages by quite some way. So
> while I appreciate the desire to separate concerns, I find it
> difficult to do so in this case.
I absolutely respect that position and concern, but there are other factors to
consider.  Let’s assume that we all agreed that a change was the right thing to
do, and we were only concerned about the transition (just to simplify the
discussion):

1) Transition cost can be used to argue against *any* global improvement to an
existing codebase.  The GCC community used this argument for many years arguing
against moving to C++ from C.  It is true that there will be a time of
transition, but at some point in time, you just decide that the cost of a global
mechanical change is worth it and you get to consistency.

2) Tremendous amounts of new code is being written, and lots of existing code is
being touched.  This provides opportunities to incrementally migrate.

3) The cost of inconsistency is low, because the affected declarations typically
have local scope.  Even the affected globals in LLVM are typically static within
a file.  This means that you don’t need a global index to go what is going on. 
Variables and functions are also often named quite different (e.g. functions
often use imperative verbs).

4) The LLVM community and project is growing through new subprojects, new
targets, and other new things, and holding back the “correct” thing in new
subprojects because of legacy code in other parts of the project doesn’t make
sense.

That said, I understand that there is still controversy about whether making a
change would improve the project, I just wanted to point out that if we converge
on that, then we should consider what LLVM looks like 10 years from now, not
just what it looks like 6 months from now.  I for one don’t want to see LLVM
stagnate, slow, and suffer because of legacy concerns.

-Chris

Alex Bradbury via llvm-dev

2019-Feb-21 14:27 UTC

head link

[llvm-dev] RFC: changing variable naming rules in LLVM codebase

On Wed, 20 Feb 2019 at 17:07, Chris Lattner <clattner at nondot.org>
wrote:> > On Feb 19, 2019, at 7:43 AM, Alex Bradbury via llvm-dev <llvm-dev
at lists.llvm.org> wrote:
> >
> > On Tue, 19 Feb 2019 at 15:24, Zachary Turner <zturner at
google.com> wrote:
> >> On Mon, Feb 18, 2019 at 2:16 AM Michael Platings via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> >>>
> >>> Regarding a plan for conversion, I'm keen to avoid perfect
being the enemy of better.
> >>
> >>
> >> It seems a bit early to discuss conversion given there’s not
consensus on a style.  For example:
> >
> > I see it a bit differently. The first question is "should we
change
> > the LLVM naming conventions". I view the plan for conversion as
> > essential to answering this question - IMHO if we're going to live
> > with mixed styles for years (which would be the case if there were no
> > concerted conversion effort) then the advantages of changing naming
> > convention are outweighed by the disadvantages by quite some way. So
> > while I appreciate the desire to separate concerns, I find it
> > difficult to do so in this case.
>
> I absolutely respect that position and concern, but there are other factors
to consider.  Let’s assume that we all agreed that a change was the right thing
to do, and we were only concerned about the transition (just to simplify the
discussion):
>
> 1) Transition cost can be used to argue against *any* global improvement to
an existing codebase.  The GCC community used this argument for many years
arguing against moving to C++ from C.  It is true that there will be a time of
transition, but at some point in time, you just decide that the cost of a global
mechanical change is worth it and you get to consistency.
>
> 2) Tremendous amounts of new code is being written, and lots of existing
code is being touched.  This provides opportunities to incrementally migrate.
>
> 3) The cost of inconsistency is low, because the affected declarations
typically have local scope.  Even the affected globals in LLVM are typically
static within a file.  This means that you don’t need a global index to go what
is going on.  Variables and functions are also often named quite different (e.g.
functions often use imperative verbs).
>
> 4) The LLVM community and project is growing through new subprojects, new
targets, and other new things, and holding back the “correct” thing in new
subprojects because of legacy code in other parts of the project doesn’t make
sense.
>
>
> That said, I understand that there is still controversy about whether
making a change would improve the project, I just wanted to point out that if we
converge on that, then we should consider what LLVM looks like 10 years from
now, not just what it looks like 6 months from now.  I for one don’t want to see
LLVM stagnate, slow, and suffer because of legacy concerns.
Thanks for elaborating on your thinking Chris. The point about new
subprojects etc is a particularly strong one.

It wasn't clear from my previous email, but my general inclination is
to pursue global improvements such as this, and to combine them with a
somewhat aggressive transition plan in order for the benefits to be
reaped sooner (and to minimise the cost of inconsistency).

Best,

Alex

via llvm-dev

2019-Feb-22 15:59 UTC

head link

[llvm-dev] RFC: changing variable naming rules in LLVM codebase

I had posted something in the code review but Chris suggested doing it 
here instead, which makes sense. Also I have to remember that the
discussion is specifically about spelling variables, not changing any
other spelling conventions.

Looking at names of "variables" there's reasonable support for
making
them visually more distinct from other kinds of names.  Regarding 
making data-member names distinct from local variables, some people 
don't see why it should matter, others find it helpful; given this 
neutral-to-helpful spectrum, going with the kind-of helpful convention 
seems preferable.

So:

- Local variables and formal parameters should be lower_case, with
  one exception: Variables/parameters that have lambda/function
  type should follow the function-name spelling rules.
- Class data members should have an "m_" prefix, so m_lower_case.
- Initialisms and other abbreviations would be considered words for 
  this purpose, so we have names such as:
    tli   // Local variable for TargetLoweringInfo
    m_cgm // Data member for CodeGenModule
- I don't have a good suggestion for file-static/global variables.
  Some people have suggested a "g_" prefix for globals, or possibly
  an "s_" prefix for class-static data.

Regarding the transition:

Some people have worried that the churn will cause blame issues.
I respectfully point out that in my own archaeology I have to deal
with lots of clang-format/indentation/other random semantically
meaningless refactoring, this is just one more.  Also the point is
not to optimize for git-blame but to optimize for reading what is
there at the moment.

A more focused and shorter transition period will create a lot of
short-term churn but get us to the good endpoint sooner.  Doing
conversions per-file or per-class (rather than per-function [too
small] or per-library [too big]) are probably the way to go.  
Given we are changing the names used for _data_, and we try to 
practice good data-hiding, the impact of the conversion of any 
given class *ought* to be reasonably confined.

If someone can make clang-tidy help with this, that's awesome.

I'm almost afraid to make the next suggestion, but here goes:
In more complicated/wide-impact cases, it would be possible to
stage a data-member name conversion into "small-bang" iterations
using a C++ tactic like this:
    class Foo {
      int m_bar; // The renamed member.
      int &Bar = m_bar; // TEMPORARY alias using the old name.
    };
This would have to be done sparingly and for good reason, such as
when the names are known across many components/subprojects and 
doing them all at once would be really too much. Someone would
have to commit to getting it all done and removing the aliases in
a reasonably short period of time.  Needing to do this trick would
be (IMO) strong evidence of poor software design and a place to
focus some refactoring effort.

HTH,
--paulr

Apparently Analagous Threads

Search for more possibly parallel threads

llvm dev - Feb 2019 - RFC: changing variable naming rules in LLVM codebase

[llvm-dev] RFC: changing variable naming rules in LLVM codebase

[llvm-dev] RFC: changing variable naming rules in LLVM codebase

[llvm-dev] RFC: changing variable naming rules in LLVM codebase

Apparently Analagous Threads