David Blaikie via llvm-dev
2017-Apr-07 22:05 UTC
[llvm-dev] Widescale clang-tidy (or similar) based cleanup
There have been some efforts recently to use clang-tidy or similar automated refactoring to make project-wide cleanups/improvements to the LLVM codebase that appear to me to be a bit out of character with the sort of changes & philosophies that have been applied in the past. I think there are a few issues at hand which I'll try to summarize: * Churn/blame-convenience: Previously, large scale changes (classically whitespace cleanup, with things like clang-format) have been rejected or discouraged due to the risk of making it harder to find the original source of a semantic change. I'm not too wedded to this issue myself, but it did seem to be the status-quo previously so I'm curious to better understand if/how people see this applying or not, to more semantic changes. * Efficiency: Sending individual or even batched reviews for automated changes like this seems inefficient. I'd rather see, for example, an email to llvm-dev to discuss whether a particular change makes sense to make across the project (ie: does the LLVM project want to cleanup all instances of classic for to range-based-for when clang-tidy can do so?) and then not send the individual changes for review, but commit them directly where possible. (if the person doing this automated cleanup doesn't have permission, yeah, send it out and reference the original discussion thread). Some points of comparison: sometimes there's similar cleanup done in a non-automated fashion as Mehdi pointed out in one of the threads I brought this up (see this thread: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170403/443547.html ). Usually some amount of cleanup has been acceptable if the code was generally being churned anyway (eg: clang-formatting a file so it's consistent, before doing major surgery to it anyway, so the surgical changes don't create formatting inconsistencies), or as a result of a new API change (add a range-based accessor then fix up existing call sites to use range-based-for). I'd also not be surprised by a whitespace cleanup shortly after new code was committed - and have done this myself "oops, forgot to format my commit, here's a new commit that does the formatting". That seems different to me from these wider-scale cleanups. I think I'm personally mostly in favor of this sort of stuff (though I think I would like some community buy-in to sign off project-wide on each clang-tidy rule/pattern/etc that's going to be applied) but it does seem new/different from the way some things have been done in the past so I'm curious how other people think about the differences/similarities/guiding principles. - David -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/5b6b8fed/attachment.html>
Adrian Prantl via llvm-dev
2017-Apr-07 22:14 UTC
[llvm-dev] Widescale clang-tidy (or similar) based cleanup
> On Apr 7, 2017, at 3:05 PM, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > There have been some efforts recently to use clang-tidy or similar automated refactoring to make project-wide cleanups/improvements to the LLVM codebase that appear to me to be a bit out of character with the sort of changes & philosophies that have been applied in the past. > > I think there are a few issues at hand which I'll try to summarize: > > * Churn/blame-convenience: > Previously, large scale changes (classically whitespace cleanup, with things like clang-format) have been rejected or discouraged due to the risk of making it harder to find the original source of a semantic change. > I'm not too wedded to this issue myself, but it did seem to be the status-quo previously so I'm curious to better understand if/how people see this applying or not, to more semantic changes.Personally I think that this is a shortcoming of one's tooling if this is a problem. Many git-blame viewers allow you to roll back to the previous revision at the current line at one keypress.> > * Efficiency: > Sending individual or even batched reviews for automated changes like this seems inefficient. I'd rather see, for example, an email to llvm-dev to discuss whether a particular change makes sense to make across the project (ie: does the LLVM project want to cleanup all instances of classic for to range-based-for when clang-tidy can do so?) and then not send the individual changes for review, but commit them directly where possible. (if the person doing this automated cleanup doesn't have permission, yeah, send it out and reference the original discussion thread).I think it makes sense to send out a phabricator review per kind of clang-tidy change, i.e., one patch that performs the same kind of change all over the project. Having only one kind of transformation per review will make it really easy to skim over the patch and detect situations where clang-tidy changes the code for the worse. -- adrian> > Some points of comparison: sometimes there's similar cleanup done in a non-automated fashion as Mehdi pointed out in one of the threads I brought this up (see this thread: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170403/443547.html ). Usually some amount of cleanup has been acceptable if the code was generally being churned anyway (eg: clang-formatting a file so it's consistent, before doing major surgery to it anyway, so the surgical changes don't create formatting inconsistencies), or as a result of a new API change (add a range-based accessor then fix up existing call sites to use range-based-for). I'd also not be surprised by a whitespace cleanup shortly after new code was committed - and have done this myself "oops, forgot to format my commit, here's a new commit that does the formatting". That seems different to me from these wider-scale cleanups. > > I think I'm personally mostly in favor of this sort of stuff (though I think I would like some community buy-in to sign off project-wide on each clang-tidy rule/pattern/etc that's going to be applied) but it does seem new/different from the way some things have been done in the past so I'm curious how other people think about the differences/similarities/guiding principles. > > - David > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reid Kleckner via llvm-dev
2017-Apr-07 22:33 UTC
[llvm-dev] Widescale clang-tidy (or similar) based cleanup
I'd add that cleanups can create conflicts with pending patches, which isn't major, but raises the bar for global cleanups slightly. Cleaning up code before modifying it is totally reasonable because you're already going to change it. In effect you're taking some ownership for it. If problems arise as a result of your cleanups or functional changes, you're on the hook to fix it. It also usually reduces the patch size of the eventual functional change, which makes it easier to understand the patch during code review and in the future during source code archaeology. I think we should basically do what you suggest: if people want to make global pattern-based cleanups (push_back -> emplace_back, range-based-for, etc), we should just raise it on llvm-dev and make a decision about the value of the cleanup. On Fri, Apr 7, 2017 at 3:05 PM, David Blaikie via llvm-dev < llvm-dev at lists.llvm.org> wrote:> There have been some efforts recently to use clang-tidy or similar > automated refactoring to make project-wide cleanups/improvements to the > LLVM codebase that appear to me to be a bit out of character with the sort > of changes & philosophies that have been applied in the past. > > I think there are a few issues at hand which I'll try to summarize: > > * Churn/blame-convenience: > Previously, large scale changes (classically whitespace cleanup, with > things like clang-format) have been rejected or discouraged due to the risk > of making it harder to find the original source of a semantic change. > I'm not too wedded to this issue myself, but it did seem to be the > status-quo previously so I'm curious to better understand if/how people see > this applying or not, to more semantic changes. > > * Efficiency: > Sending individual or even batched reviews for automated changes like this > seems inefficient. I'd rather see, for example, an email to llvm-dev to > discuss whether a particular change makes sense to make across the project > (ie: does the LLVM project want to cleanup all instances of classic for to > range-based-for when clang-tidy can do so?) and then not send the > individual changes for review, but commit them directly where possible. (if > the person doing this automated cleanup doesn't have permission, yeah, send > it out and reference the original discussion thread). > > Some points of comparison: sometimes there's similar cleanup done in a > non-automated fashion as Mehdi pointed out in one of the threads I brought > this up (see this thread: http://lists.llvm.org/ > pipermail/llvm-commits/Week-of-Mon-20170403/443547.html ). Usually some > amount of cleanup has been acceptable if the code was generally being > churned anyway (eg: clang-formatting a file so it's consistent, before > doing major surgery to it anyway, so the surgical changes don't create > formatting inconsistencies), or as a result of a new API change (add a > range-based accessor then fix up existing call sites to use > range-based-for). I'd also not be surprised by a whitespace cleanup shortly > after new code was committed - and have done this myself "oops, forgot to > format my commit, here's a new commit that does the formatting". That seems > different to me from these wider-scale cleanups. > > I think I'm personally mostly in favor of this sort of stuff (though I > think I would like some community buy-in to sign off project-wide on each > clang-tidy rule/pattern/etc that's going to be applied) but it does seem > new/different from the way some things have been done in the past so I'm > curious how other people think about the differences/similarities/guiding > principles. > > - David > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/b410be63/attachment.html>
David Blaikie via llvm-dev
2017-Apr-07 22:49 UTC
[llvm-dev] Widescale clang-tidy (or similar) based cleanup
On Fri, Apr 7, 2017 at 3:14 PM Adrian Prantl <aprantl at apple.com> wrote:> > > On Apr 7, 2017, at 3:05 PM, David Blaikie via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > There have been some efforts recently to use clang-tidy or similar > automated refactoring to make project-wide cleanups/improvements to the > LLVM codebase that appear to me to be a bit out of character with the sort > of changes & philosophies that have been applied in the past. > > > > I think there are a few issues at hand which I'll try to summarize: > > > > * Churn/blame-convenience: > > Previously, large scale changes (classically whitespace cleanup, with > things like clang-format) have been rejected or discouraged due to the risk > of making it harder to find the original source of a semantic change. > > I'm not too wedded to this issue myself, but it did seem to be the > status-quo previously so I'm curious to better understand if/how people see > this applying or not, to more semantic changes. > > Personally I think that this is a shortcoming of one's tooling if this is > a problem. Many git-blame viewers allow you to roll back to the previous > revision at the current line at one keypress. > > > > > * Efficiency: > > Sending individual or even batched reviews for automated changes like > this seems inefficient. I'd rather see, for example, an email to llvm-dev > to discuss whether a particular change makes sense to make across the > project (ie: does the LLVM project want to cleanup all instances of classic > for to range-based-for when clang-tidy can do so?) and then not send the > individual changes for review, but commit them directly where possible. (if > the person doing this automated cleanup doesn't have permission, yeah, send > it out and reference the original discussion thread). > > I think it makes sense to send out a phabricator review per kind of > clang-tidy change, i.e., one patch that performs the same kind of change > all over the project. Having only one kind of transformation per review > will make it really easy to skim over the patch and detect situations where > clang-tidy changes the code for the worse. >Reckon it's worth elevating these reviews to llvm-dev rather than only llvm-commits? (in effect what I'd expect to see is a "hey, thinking about applying this pattern-based* cleanup - here are some examples of what it does, the corner cases, cases it doesn't handle, etc (and, as an aside, the attached patch shows the changes created by applying this tool to all of LLVM (or LLVM+Clang, etc))") Anyone have opinions on whether llvm-dev would be a sufficient medium to approve automated cleanup of other subprojects? (Clang? compiler-rt? lld? lldb?). Or would it be necessary to have a discussion aobut each subproject? (roughly I feel like LLVM+Clang are close enough together that a single approval would suffice there at least - but maybe Clang devs** have other ideas... ) *thanks, Reid, for that choice of terminology ** Pure Clang devs notably not included in this email thread...> > -- adrian > > > > Some points of comparison: sometimes there's similar cleanup done in a > non-automated fashion as Mehdi pointed out in one of the threads I brought > this up (see this thread: > http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170403/443547.html > ). Usually some amount of cleanup has been acceptable if the code was > generally being churned anyway (eg: clang-formatting a file so it's > consistent, before doing major surgery to it anyway, so the surgical > changes don't create formatting inconsistencies), or as a result of a new > API change (add a range-based accessor then fix up existing call sites to > use range-based-for). I'd also not be surprised by a whitespace cleanup > shortly after new code was committed - and have done this myself "oops, > forgot to format my commit, here's a new commit that does the formatting". > That seems different to me from these wider-scale cleanups. > > > > I think I'm personally mostly in favor of this sort of stuff (though I > think I would like some community buy-in to sign off project-wide on each > clang-tidy rule/pattern/etc that's going to be applied) but it does seem > new/different from the way some things have been done in the past so I'm > curious how other people think about the differences/similarities/guiding > principles. > > > > - David > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/2f44bf09/attachment.html>