thr3ads.net - llvm dev - [llvm-dev] Widescale clang-tidy (or similar) based cleanup [Apr 2017]

If this information is useful, please help other people find it:
Share via:

David Blaikie via llvm-dev

2017-Apr-07 22:05 UTC

[llvm-dev] Widescale clang-tidy (or similar) based cleanup

There have been some efforts recently to use clang-tidy or similar
automated refactoring to make project-wide cleanups/improvements to the
LLVM codebase that appear to me to be a bit out of character with the sort
of changes & philosophies that have been applied in the past.

I think there are a few issues at hand which I'll try to summarize:

* Churn/blame-convenience:
Previously, large scale changes (classically whitespace cleanup, with
things like clang-format) have been rejected or discouraged due to the risk
of making it harder to find the original source of a semantic change.
I'm not too wedded to this issue myself, but it did seem to be the
status-quo previously so I'm curious to better understand if/how people see
this applying or not, to more semantic changes.

* Efficiency:
Sending individual or even batched reviews for automated changes like this
seems inefficient. I'd rather see, for example, an email to llvm-dev to
discuss whether a particular change makes sense to make across the project
(ie: does the LLVM project want to cleanup all instances of classic for to
range-based-for when clang-tidy can do so?) and then not send the
individual changes for review, but commit them directly where possible. (if
the person doing this automated cleanup doesn't have permission, yeah, send
it out and reference the original discussion thread).

Some points of comparison: sometimes there's similar cleanup done in a
non-automated fashion as Mehdi pointed out in one of the threads I brought
this up (see this thread:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170403/443547.html
).
Usually some amount of cleanup has been acceptable if the code was
generally being churned anyway (eg: clang-formatting a file so it's
consistent, before doing major surgery to it anyway, so the surgical
changes don't create formatting inconsistencies), or as a result of a new
API change (add a range-based accessor then fix up existing call sites to
use range-based-for). I'd also not be surprised by a whitespace cleanup
shortly after new code was committed - and have done this myself "oops,
forgot to format my commit, here's a new commit that does the
formatting".
That seems different to me from these wider-scale cleanups.

I think I'm personally mostly in favor of this sort of stuff (though I
think I would like some community buy-in to sign off project-wide on each
clang-tidy rule/pattern/etc that's going to be applied) but it does seem
new/different from the way some things have been done in the past so I'm
curious how other people think about the differences/similarities/guiding
principles.

- David
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/5b6b8fed/attachment.html>

Adrian Prantl via llvm-dev

2017-Apr-07 22:14 UTC

head link

[llvm-dev] Widescale clang-tidy (or similar) based cleanup

> On Apr 7, 2017, at 3:05 PM, David Blaikie via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> There have been some efforts recently to use clang-tidy or similar
automated refactoring to make project-wide cleanups/improvements to the LLVM
codebase that appear to me to be a bit out of character with the sort of changes
& philosophies that have been applied in the past.
> 
> I think there are a few issues at hand which I'll try to summarize:
> 
> * Churn/blame-convenience:
> Previously, large scale changes (classically whitespace cleanup, with
things like clang-format) have been rejected or discouraged due to the risk of
making it harder to find the original source of a semantic change.
> I'm not too wedded to this issue myself, but it did seem to be the
status-quo previously so I'm curious to better understand if/how people see
this applying or not, to more semantic changes.
Personally I think that this is a shortcoming of one's tooling if this is a
problem. Many git-blame viewers allow you to roll back to the previous revision
at the current line at one keypress.
> 
> * Efficiency:
> Sending individual or even batched reviews for automated changes like this
seems inefficient. I'd rather see, for example, an email to llvm-dev to
discuss whether a particular change makes sense to make across the project (ie:
does the LLVM project want to cleanup all instances of classic for to
range-based-for when clang-tidy can do so?) and then not send the individual
changes for review, but commit them directly where possible. (if the person
doing this automated cleanup doesn't have permission, yeah, send it out and
reference the original discussion thread).
I think it makes sense to send out a phabricator review per kind of clang-tidy
change, i.e., one patch that performs the same kind of change all over the
project. Having only one kind of transformation per review will make it really
easy to skim over the patch and detect situations where clang-tidy changes the
code for the worse.

-- adrian> 
> Some points of comparison: sometimes there's similar cleanup done in a
non-automated fashion as Mehdi pointed out in one of the threads I brought this
up (see this thread:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170403/443547.html ).
Usually some amount of cleanup has been acceptable if the code was generally
being churned anyway (eg: clang-formatting a file so it's consistent, before
doing major surgery to it anyway, so the surgical changes don't create
formatting inconsistencies), or as a result of a new API change (add a
range-based accessor then fix up existing call sites to use range-based-for).
I'd also not be surprised by a whitespace cleanup shortly after new code was
committed - and have done this myself "oops, forgot to format my commit,
here's a new commit that does the formatting". That seems different to
me from these wider-scale cleanups.
> 
> I think I'm personally mostly in favor of this sort of stuff (though I
think I would like some community buy-in to sign off project-wide on each
clang-tidy rule/pattern/etc that's going to be applied) but it does seem
new/different from the way some things have been done in the past so I'm
curious how other people think about the differences/similarities/guiding
principles.
> 
> - David
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Reid Kleckner via llvm-dev

2017-Apr-07 22:33 UTC

head link

[llvm-dev] Widescale clang-tidy (or similar) based cleanup

I'd add that cleanups can create conflicts with pending patches, which
isn't major, but raises the bar for global cleanups slightly.

Cleaning up code before modifying it is totally reasonable because you're
already going to change it. In effect you're taking some ownership for it.
If problems arise as a result of your cleanups or functional changes,
you're on the hook to fix it. It also usually reduces the patch size of the
eventual functional change, which makes it easier to understand the patch
during code review and in the future during source code archaeology.

I think we should basically do what you suggest: if people want to make
global pattern-based cleanups (push_back -> emplace_back, range-based-for,
etc), we should just raise it on llvm-dev and make a decision about the
value of the cleanup.

On Fri, Apr 7, 2017 at 3:05 PM, David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> There have been some efforts recently to use clang-tidy or similar
> automated refactoring to make project-wide cleanups/improvements to the
> LLVM codebase that appear to me to be a bit out of character with the sort
> of changes & philosophies that have been applied in the past.
>
> I think there are a few issues at hand which I'll try to summarize:
>
> * Churn/blame-convenience:
> Previously, large scale changes (classically whitespace cleanup, with
> things like clang-format) have been rejected or discouraged due to the risk
> of making it harder to find the original source of a semantic change.
> I'm not too wedded to this issue myself, but it did seem to be the
> status-quo previously so I'm curious to better understand if/how people
see
> this applying or not, to more semantic changes.
>
> * Efficiency:
> Sending individual or even batched reviews for automated changes like this
> seems inefficient. I'd rather see, for example, an email to llvm-dev to
> discuss whether a particular change makes sense to make across the project
> (ie: does the LLVM project want to cleanup all instances of classic for to
> range-based-for when clang-tidy can do so?) and then not send the
> individual changes for review, but commit them directly where possible. (if
> the person doing this automated cleanup doesn't have permission, yeah,
send
> it out and reference the original discussion thread).
>
> Some points of comparison: sometimes there's similar cleanup done in a
> non-automated fashion as Mehdi pointed out in one of the threads I brought
> this up (see this thread: http://lists.llvm.org/
> pipermail/llvm-commits/Week-of-Mon-20170403/443547.html ). Usually some
> amount of cleanup has been acceptable if the code was generally being
> churned anyway (eg: clang-formatting a file so it's consistent, before
> doing major surgery to it anyway, so the surgical changes don't create
> formatting inconsistencies), or as a result of a new API change (add a
> range-based accessor then fix up existing call sites to use
> range-based-for). I'd also not be surprised by a whitespace cleanup
shortly
> after new code was committed - and have done this myself "oops, forgot
to
> format my commit, here's a new commit that does the formatting".
That seems
> different to me from these wider-scale cleanups.
>
> I think I'm personally mostly in favor of this sort of stuff (though I
> think I would like some community buy-in to sign off project-wide on each
> clang-tidy rule/pattern/etc that's going to be applied) but it does
seem
> new/different from the way some things have been done in the past so
I'm
> curious how other people think about the differences/similarities/guiding
> principles.
>
> - David
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/b410be63/attachment.html>

David Blaikie via llvm-dev

2017-Apr-07 22:49 UTC

head link

[llvm-dev] Widescale clang-tidy (or similar) based cleanup

On Fri, Apr 7, 2017 at 3:14 PM Adrian Prantl <aprantl at apple.com> wrote:
>
> > On Apr 7, 2017, at 3:05 PM, David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > There have been some efforts recently to use clang-tidy or similar
> automated refactoring to make project-wide cleanups/improvements to the
> LLVM codebase that appear to me to be a bit out of character with the sort
> of changes & philosophies that have been applied in the past.
> >
> > I think there are a few issues at hand which I'll try to
summarize:
> >
> > * Churn/blame-convenience:
> > Previously, large scale changes (classically whitespace cleanup, with
> things like clang-format) have been rejected or discouraged due to the risk
> of making it harder to find the original source of a semantic change.
> > I'm not too wedded to this issue myself, but it did seem to be the
> status-quo previously so I'm curious to better understand if/how people
see
> this applying or not, to more semantic changes.
>
> Personally I think that this is a shortcoming of one's tooling if this
is
> a problem. Many git-blame viewers allow you to roll back to the previous
> revision at the current line at one keypress.
>
> >
> > * Efficiency:
> > Sending individual or even batched reviews for automated changes like
> this seems inefficient. I'd rather see, for example, an email to
llvm-dev
> to discuss whether a particular change makes sense to make across the
> project (ie: does the LLVM project want to cleanup all instances of classic
> for to range-based-for when clang-tidy can do so?) and then not send the
> individual changes for review, but commit them directly where possible. (if
> the person doing this automated cleanup doesn't have permission, yeah,
send
> it out and reference the original discussion thread).
>
> I think it makes sense to send out a phabricator review per kind of
> clang-tidy change, i.e., one patch that performs the same kind of change
> all over the project. Having only one kind of transformation per review
> will make it really easy to skim over the patch and detect situations where
> clang-tidy changes the code for the worse.
>
Reckon it's worth elevating these reviews to llvm-dev rather than only
llvm-commits? (in effect what I'd expect to see is a "hey, thinking
about
applying this pattern-based* cleanup - here are some examples of what it
does, the corner cases, cases it doesn't handle, etc (and, as an aside, the
attached patch shows the changes created by applying this tool to all of
LLVM (or LLVM+Clang, etc))")

Anyone have opinions on whether llvm-dev would be a sufficient medium to
approve automated cleanup of other subprojects? (Clang? compiler-rt? lld?
lldb?). Or would it be necessary to have a discussion aobut each
subproject? (roughly I feel like LLVM+Clang are close enough together that
a single approval would suffice there at least - but maybe Clang devs**
have other ideas... )

*thanks, Reid, for that choice of terminology

** Pure Clang devs notably not included in this email thread...

>
> -- adrian
> >
> > Some points of comparison: sometimes there's similar cleanup done
in a
> non-automated fashion as Mehdi pointed out in one of the threads I brought
> this up (see this thread:
>
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170403/443547.html
> ). Usually some amount of cleanup has been acceptable if the code was
> generally being churned anyway (eg: clang-formatting a file so it's
> consistent, before doing major surgery to it anyway, so the surgical
> changes don't create formatting inconsistencies), or as a result of a
new
> API change (add a range-based accessor then fix up existing call sites to
> use range-based-for). I'd also not be surprised by a whitespace cleanup
> shortly after new code was committed - and have done this myself
"oops,
> forgot to format my commit, here's a new commit that does the
formatting".
> That seems different to me from these wider-scale cleanups.
> >
> > I think I'm personally mostly in favor of this sort of stuff
(though I
> think I would like some community buy-in to sign off project-wide on each
> clang-tidy rule/pattern/etc that's going to be applied) but it does
seem
> new/different from the way some things have been done in the past so
I'm
> curious how other people think about the differences/similarities/guiding
> principles.
> >
> > - David
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/2f44bf09/attachment.html>

Possibly Parallel Threads

Search for more maybe matching threads

llvm dev - Apr 2017 - Widescale clang-tidy (or similar) based cleanup

[llvm-dev] Widescale clang-tidy (or similar) based cleanup

[llvm-dev] Widescale clang-tidy (or similar) based cleanup

[llvm-dev] Widescale clang-tidy (or similar) based cleanup

[llvm-dev] Widescale clang-tidy (or similar) based cleanup

Possibly Parallel Threads