thr3ads.net - llvm dev - [LLVMdev] Stricter adherence to coding standards in LLVM? [Feb 2011]

If this information is useful, please help other people find it:
Share via:

Talin

2011-Feb-12 00:17 UTC

[LLVMdev] Stricter adherence to coding standards in LLVM?

I notice that there's a lot of code in LLVM core libraries that is wildly
inconsistent in formatting. Two particular things come to mind:

-- Capitalization of method names. The LLVM guidelines say one thing, but a
lot of recent code (such as IRBuilder) consistently does the opposite of
what the guidelines say. One has to wonder which source of truth is more
authoritative - after all, the docs could always be out of date.

-- Whitespace around * and & in variable declarations. I see some files that
consistently put the space after, some files put it before, while some files
use one convention for parameters, and a different convention for member
variables and local variables. Within a single file, the usage is pretty
consistent, so you know that the people who wrote the code were really
trying to adhere to a standard.

Now, common wisdom for open source projects is that inconsistent legacy
formatting is left alone, and new code adheres to whatever style is used in
nearby code. Otherwise it creates difficulty for people who want to compare
revisions in the source control. Occasionally there is a need to diff
revisions that are widely spaced in time, and a global reformatting makes
this hard if the reformatting change falls somewhere within the timespan of
interest. The downside of this approach is that you're stuck with the
inconsistent formatting forever.

In my own open source projects, I take the opposite approach, which is to
nip any inconsistencies in the bud as early and aggressively as possible, so
that they won't cause problems later on. A person who needs to compare two
revisions is far more likely to be interested in recent revisions than
really old ones, so it's better to do large changes sooner rather than later
- the farther in the past we can push back the reformatting the better.

There is also the problem of backwards compatibility with existing users of
the code. Obviously it's bad form to break user code for trivial stylistic
reasons. However, if there is some other API change with broad impact that's
going in, it might make sense to fix the names at the same time, given that
folks are going to have to update their code anyway. (Although that's
unlikely with a codebase as mature as LLVM). Another approach is to
deprecate the old API for a given class entirely, and replace it with some
newer, shinier version that just happens to conform to the letter of the
conventions, so that people can continue to use the old class until they get
around to upgrading.

In the case of whitespace around * and &, that is semantically neutral, and
the impact on the diffs are small, so a global fix to these would incur less
pain overall. The main danger is that the discussion of proper style is
likely to turn into a major religious war, making it not worth the effort to
even raise the topic.

-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110211/f2e45dda/attachment.html>

Chris Lattner

2011-Feb-14 19:04 UTC

head link

[LLVMdev] Stricter adherence to coding standards in LLVM?

On Feb 11, 2011, at 4:17 PM, Talin wrote:
> I notice that there's a lot of code in LLVM core libraries that is
wildly inconsistent in formatting. Two particular things come to mind:
> 
> -- Capitalization of method names. The LLVM guidelines say one thing, but a
lot of recent code (such as IRBuilder) consistently does the opposite of what
the guidelines say. One has to wonder which source of truth is more
authoritative - after all, the docs could always be out of date.
IRBuilder is recent, but the naming conventions are even more so.
> -- Whitespace around * and & in variable declarations. I see some files
that consistently put the space after, some files put it before, while some
files use one convention for parameters, and a different convention for member
variables and local variables. Within a single file, the usage is pretty
consistent, so you know that the people who wrote the code were really trying to
adhere to a standard.
Yes, this is annoying.
> Now, common wisdom for open source projects is that inconsistent legacy
formatting is left alone, and new code adheres to whatever style is used in
nearby code.
...>  nip any inconsistencies in the bud as early and aggressively as possible
..> There is also the problem of backwards compatibility with existing users of
the code.
I'm not worried about API stability, this is something we don't
guarantee.

The approach I'd like to see is gentle, gradual migration.  As new code is
written, it should follow the standards, as existing code is modified, it should
be updated to match the standards.  This is what I do when working on the
codebase, and would appreciate it if others would as well.  It's best to not
conflate coding standards changes in with other semantic changes though.

Eventually, it would be great for someone to make a clang mode to enforce the
standards for the project.  If we had that, we could do a post-commit hook or
something.

-Chris

Apparently Analagous Threads

Search for more possibly parallel threads

llvm dev - Feb 2011 - [LLVMdev] Stricter adherence to coding standards in LLVM?

[LLVMdev] Stricter adherence to coding standards in LLVM?

[LLVMdev] Stricter adherence to coding standards in LLVM?

Apparently Analagous Threads