MyDeveloper Day via llvm-dev
2021-Aug-09 19:36 UTC
[llvm-dev] [RFC] Adding support for clang-format making further code modifying changes
Hi all, As a frequent user and maintainer of clang-format I would like to formalize a change to the "charter" of what clang-format can/could/should do Motivation ========= As you all know clang-format parses source code and can manipulate the tokens to produce consistently formatted code (most of the time), its become ubiquitous in the industry and its integration into popular editors & IDEs such as vim/visual studio/code mean it very simple for users to get setup and running producing good looking code. clang-format does not use semantic information, and as such it doesn't need includes, defines or compiler flags to interpret code. Clang-format is generally guessing that certain sequences of tokens from the lexer represent certain patterns. It's a good guess but it gets it wrong from time to time, hence trading correctness for performance. Because of this clang-format is fast (not maybe as fast as we'd like) but fast enough to be part of in a "save" operation such that the code is formatted as the ide saves your work. Mostly clang-format manipulates only whitespace, but over the years there have been a number of extremely useful features that have broken this rule, namely include sorting, namespace commenting to name a few. The usage scenario of clang-format has also changed slightly to include a non modifying advisory role identifying clang-format violations (as in our own llvm-premerge checks), which can greatly aid the code review process by removing the need to constantly ask for the code to be formatted correctly or follow the LLVM convention. Recently a number of new features have been proposed that would also alter code, insertion of braces, east/west const conversion that can be performed at "save" speeds. As no semantic information is present, this raises the question as to whether clang-format could actually break your code. This has actually always been the case especially since the introduction of include sorting, but also we all know that clang-format can break your code from the visual perspective too and hence the need for // clang-format off/on In the most part include sorting not only might break your code noisily such that it won't compile, but it can also break it silently, and because IncludeSorting is on by default this breakage could potentially go unnoticed. https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code https://travisdowns.github.io/blog/2019/11/19/toupper.html I don't think it can be in any doubt that IncludeSorting is a nice feature used by 100,000's of developers without too many issues, but there is some suggestion that its inclusion as "on by default" was potentially a mistake. Proposals for other new features that modify the code in a similar way are getting some push back for changing the "charter" of clang-format when it could be considered to have already changed. This is causing friction on the review of some features and so it was suggested to present an RFC to gain wider consensus on the concept of clang-format changing code. Mostly when a code modifying change is submitted the view is that this isn't clang-formats job but more clang-tidy, however clang-tidy requires full access to all include files and compiler definitions and only works on the preprocessor paths of the code you are analyzing for and its speed and hence its frequency of use is drastically reduced. Some clang-format based modifications can in theory be made with a relatively high level of confidence without paying the price and the configuration complexity of getting all the semantic information. https://reviews.llvm.org/D105701. There is potentially for clang-format to introduce breaking changes and whilst this could in theory cause noisy breakages they could also in theory produce silent ones. These new features want to be run at "reformat" speeds & frequency and benefit from the rich Ecosystem of inclusion and integration in IDEs and editors that clang-format enjoys. This RFC is to try to gain some consensus as to what clang-format can do and what the conditions/constraints should be if allowed to do so. Benefits ======= The benefits are that clang-format can be used to further make code modifications to adhere to a coding convention (insertion/removal of braces), clang-format could be used to validate and correct coding convention (left/right const), and could be used to remove unnecessary semicolons or potentially convert code to trailing return types all of which could be performed at "reformat" speeds. Whilst some of these capabilities are available in clang-tidy, it requires significant infrastructure to often perform these often relatively simple operations and it's unlikely that all users of clang-format are set up to perform these actions in clang-tidy. There are likely a number of clang-tidy modifications that could in theory be made at "reformat" speeds with such an approach. But there really needs some agreement that it's OK for clang-format to modify the code. Allowing these kinds of modification capabilities could lead to a new set of "Resharper" style capabilities being added to clang-format, capable of bringing source code quickly into line with coding conventions. Concerns ======= Correctness is King, the concern is your formatting tool should not perform operations that could break your code. (this is already the case) It's perhaps not clang-format's job to do this. I should personally declare myself as being in favor of allowing clang-format to modify code, I think it only fair that I let others reply to the RFC with their own concerns. Constraints ========== To minimize the impact to existing users, We suggest that a number of constraints be generally considered good practice when submitting reviews for clang-format with modifying changes 1) Code Modifying Features should always be off by default The user should have to make a positive affirmative action to use such a feature 2) Code Modifying Features configuration options should be highlighted as such in the ClangFormatStyleOptions.rst such that its clear these are potentially code breaking options 3) Existing "Code Modifying Features" do not have to adhere to 1) but the documentation should be updated to adhere to 2) 4) Code Modifying Features should be conservative to be "correct first" before being "complete". i.e. If it's possible a change could be ambiguous it should tend towards not making the incorrect change at all rather than forcing an incorrect change. (This could cause some cases to be missed) Request ====== I would like to get some general agreement that it's OK for future reviews of code modification changes to become part of clang-format (as they are in IncludeSorting) assuming the best practices are followed to protect users from unwanted changes. Feedback would be appreciated MyDeveloperDay -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210809/f3516b58/attachment.html>
Keane, Erich via llvm-dev
2021-Aug-09 19:46 UTC
[llvm-dev] [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes
My thoughts (as one of the people who pushed for this RFC) are that we have a few options: 1. Clang Format is NEVER allowed to break code. This is already violated with the reordering of includes. 2. Clang Format should be able to break code at will, and we should publish this fact. 3. Clang Format should strive to not break code, but can make exceptions for good features when the examples are quite contrived, and the patch makes an as-good-as-reasonable attempt to avoid those breakages. I personally LIKE #1 in theory, but the ship has sailed, and it is at the expense of GOOD formatting options. I personally HATE #2, I think it makes the tool unusable as a formatting tool. I personally quite like #3, and believe it should be the policy going forward (which I believe is the RIGHT outcome of this RFC). I also believe the patch for left/right const already falls into this category, and thus would be generally acceptable to me (though, I think the include-reordering distinctly does NOT). -Erich From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of MyDeveloper Day via cfe-dev Sent: Monday, August 9, 2021 12:36 PM To: clang developer list <cfe-dev at lists.llvm.org>; llvm-dev <llvm-dev at lists.llvm.org> Subject: [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes Hi all, As a frequent user and maintainer of clang-format I would like to formalize a change to the "charter" of what clang-format can/could/should do Motivation ========= As you all know clang-format parses source code and can manipulate the tokens to produce consistently formatted code (most of the time), its become ubiquitous in the industry and its integration into popular editors & IDEs such as vim/visual studio/code mean it very simple for users to get setup and running producing good looking code. clang-format does not use semantic information, and as such it doesn't need includes, defines or compiler flags to interpret code. Clang-format is generally guessing that certain sequences of tokens from the lexer represent certain patterns. It's a good guess but it gets it wrong from time to time, hence trading correctness for performance. Because of this clang-format is fast (not maybe as fast as we'd like) but fast enough to be part of in a "save" operation such that the code is formatted as the ide saves your work. Mostly clang-format manipulates only whitespace, but over the years there have been a number of extremely useful features that have broken this rule, namely include sorting, namespace commenting to name a few. The usage scenario of clang-format has also changed slightly to include a non modifying advisory role identifying clang-format violations (as in our own llvm-premerge checks), which can greatly aid the code review process by removing the need to constantly ask for the code to be formatted correctly or follow the LLVM convention. Recently a number of new features have been proposed that would also alter code, insertion of braces, east/west const conversion that can be performed at "save" speeds. As no semantic information is present, this raises the question as to whether clang-format could actually break your code. This has actually always been the case especially since the introduction of include sorting, but also we all know that clang-format can break your code from the visual perspective too and hence the need for // clang-format off/on In the most part include sorting not only might break your code noisily such that it won't compile, but it can also break it silently, and because IncludeSorting is on by default this breakage could potentially go unnoticed. https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code https://travisdowns.github.io/blog/2019/11/19/toupper.html I don't think it can be in any doubt that IncludeSorting is a nice feature used by 100,000's of developers without too many issues, but there is some suggestion that its inclusion as "on by default" was potentially a mistake. Proposals for other new features that modify the code in a similar way are getting some push back for changing the "charter" of clang-format when it could be considered to have already changed. This is causing friction on the review of some features and so it was suggested to present an RFC to gain wider consensus on the concept of clang-format changing code. Mostly when a code modifying change is submitted the view is that this isn't clang-formats job but more clang-tidy, however clang-tidy requires full access to all include files and compiler definitions and only works on the preprocessor paths of the code you are analyzing for and its speed and hence its frequency of use is drastically reduced. Some clang-format based modifications can in theory be made with a relatively high level of confidence without paying the price and the configuration complexity of getting all the semantic information. https://reviews.llvm.org/D105701. There is potentially for clang-format to introduce breaking changes and whilst this could in theory cause noisy breakages they could also in theory produce silent ones. These new features want to be run at "reformat" speeds & frequency and benefit from the rich Ecosystem of inclusion and integration in IDEs and editors that clang-format enjoys. This RFC is to try to gain some consensus as to what clang-format can do and what the conditions/constraints should be if allowed to do so. Benefits ======= The benefits are that clang-format can be used to further make code modifications to adhere to a coding convention (insertion/removal of braces), clang-format could be used to validate and correct coding convention (left/right const), and could be used to remove unnecessary semicolons or potentially convert code to trailing return types all of which could be performed at "reformat" speeds. Whilst some of these capabilities are available in clang-tidy, it requires significant infrastructure to often perform these often relatively simple operations and it's unlikely that all users of clang-format are set up to perform these actions in clang-tidy. There are likely a number of clang-tidy modifications that could in theory be made at "reformat" speeds with such an approach. But there really needs some agreement that it's OK for clang-format to modify the code. Allowing these kinds of modification capabilities could lead to a new set of "Resharper" style capabilities being added to clang-format, capable of bringing source code quickly into line with coding conventions. Concerns ======= Correctness is King, the concern is your formatting tool should not perform operations that could break your code. (this is already the case) It's perhaps not clang-format's job to do this. I should personally declare myself as being in favor of allowing clang-format to modify code, I think it only fair that I let others reply to the RFC with their own concerns. Constraints ========== To minimize the impact to existing users, We suggest that a number of constraints be generally considered good practice when submitting reviews for clang-format with modifying changes 1) Code Modifying Features should always be off by default The user should have to make a positive affirmative action to use such a feature 2) Code Modifying Features configuration options should be highlighted as such in the ClangFormatStyleOptions.rst such that its clear these are potentially code breaking options 3) Existing "Code Modifying Features" do not have to adhere to 1) but the documentation should be updated to adhere to 2) 4) Code Modifying Features should be conservative to be "correct first" before being "complete". i.e. If it's possible a change could be ambiguous it should tend towards not making the incorrect change at all rather than forcing an incorrect change. (This could cause some cases to be missed) Request ====== I would like to get some general agreement that it's OK for future reviews of code modification changes to become part of clang-format (as they are in IncludeSorting) assuming the best practices are followed to protect users from unwanted changes. Feedback would be appreciated MyDeveloperDay -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210809/a39960b4/attachment.html>
David Blaikie via llvm-dev
2021-Aug-09 19:54 UTC
[llvm-dev] [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes
+some of the folks involved in the early development of clang-format who might have more context on those design principles/choices made back then (& perhaps on the include reordering change/tradeoff/choices made there) Anyone got a link to the include reordering change/design discussion for it? Might be informative to see if there was precedent/philosophy clearly stated about how that was acceptable and what principles were applied & could be reapplied to more recent proposals. I tend towards the status quo (in general - that's just my personality) and tend to prefer the fairly firm line clang-format currently has. But I don't have a great rationale for "why include reordering but not const reordering or dropping braces, etc" - marginally: Changing C++ syntax is more complicated than reordering headers. Despite the fact that reordering headers can have a big impact/breakage, it's hopefully fairly clear if/when that happens that headers should be made order-agnostic. The subtleties of what might happen when adding/removing braces if the heuristic quick-parsing of clang-format fail seem likely to be more unfortunate, to me at least. But maybe we're at a point where clang-format is widely adopted/tested enough that that's less likely to have subtle mis-parse breakages for terribly long? Not sure. - Dave On Mon, Aug 9, 2021 at 12:36 PM MyDeveloper Day via cfe-dev < cfe-dev at lists.llvm.org> wrote:> Hi all, > > As a frequent user and maintainer of clang-format I would like to > formalize a change to the "charter" of what clang-format can/could/should do > > Motivation > =========> > As you all know clang-format parses source code and can manipulate the > tokens to produce consistently formatted code (most of the time), its > become ubiquitous in the industry and its integration into > popular editors & IDEs such as vim/visual studio/code mean it very simple > for users to get setup and running producing good looking code. > > clang-format does not use semantic information, and as such it doesn't > need includes, defines or compiler flags to interpret code. Clang-format is > generally guessing that certain sequences of tokens from the lexer > represent certain patterns. It's a good guess but it gets it wrong from > time to time, hence trading correctness for performance. > > Because of this clang-format is fast (not maybe as fast as we'd like) but > fast enough to be part of in a "save" operation such that the code is > formatted as the ide saves your work. > > Mostly clang-format manipulates only whitespace, but over the years there > have been a number of extremely useful features that have broken this rule, > namely include sorting, namespace commenting to name a few. > > The usage scenario of clang-format has also changed slightly to include a > non modifying advisory role identifying clang-format violations (as in our > own llvm-premerge checks), which can greatly aid the code review process by > removing the need to constantly ask for the code to be formatted correctly > or follow the LLVM convention. > > Recently a number of new features have been proposed that would also alter > code, insertion of braces, east/west const conversion that can be performed > at "save" speeds. > > As no semantic information is present, this raises the question as to > whether clang-format could actually break your code. > This has actually always been the case especially since the introduction > of include sorting, but also we all know that clang-format can break your > code from the visual perspective too and hence the need for // clang-format > off/on > > In the most part include sorting not only might break your code noisily > such that it won't compile, but it can also break it silently, > and because IncludeSorting is on by default this breakage could > potentially go unnoticed. > > https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code > https://travisdowns.github.io/blog/2019/11/19/toupper.html > > I don't think it can be in any doubt that IncludeSorting is a nice feature > used by 100,000's of developers without too many issues, but there is some > suggestion that its inclusion as "on by default" was potentially a mistake. > > Proposals for other new features that modify the code in a similar way are > getting some push back for changing the "charter" of clang-format when it > could be considered to have already changed. > This is causing friction on the review of some features and so it was > suggested to present an RFC to gain wider consensus on the concept of > clang-format changing code. > > Mostly when a code modifying change is submitted the view is that this > isn't clang-formats job but more clang-tidy, however clang-tidy requires > full access to all include files and compiler definitions and only works on > the preprocessor paths of the code you are analyzing for and its speed and > hence its frequency of use is drastically reduced. > > Some clang-format based modifications can in theory be made with a > relatively high level of confidence without paying the price and the > configuration complexity of getting all the > semantic information. https://reviews.llvm.org/D105701. > There is potentially for clang-format to introduce breaking changes and > whilst this could in theory cause noisy breakages they could also in theory > produce silent ones. > > These new features want to be run at "reformat" speeds & frequency and > benefit from the rich Ecosystem of inclusion and integration in IDEs and > editors that clang-format enjoys. > > This RFC is to try to gain some consensus as to what clang-format can do > and what the conditions/constraints should be if allowed to do so. > > Benefits > =======> > The benefits are that clang-format can be used to further make code > modifications to adhere to a coding convention (insertion/removal of > braces), > clang-format could be used to validate and correct coding convention > (left/right const), and could be used to remove unnecessary semicolons or > potentially convert code to trailing return types all of which could be > performed at "reformat" speeds. > > Whilst some of these capabilities are available in clang-tidy, it requires > significant infrastructure to often perform these often relatively simple > operations and it's unlikely > that all users of clang-format are set up to perform these actions in > clang-tidy. > > There are likely a number of clang-tidy modifications that could in theory > be made at "reformat" speeds with such an approach. But there really needs > some agreement that it's OK for clang-format to modify the code. > > Allowing these kinds of modification capabilities could lead to a new set > of "Resharper" style capabilities being added to clang-format, > capable of bringing source code quickly into line with coding conventions. > > Concerns > =======> > Correctness is King, the concern is your formatting tool should not > perform operations that could break your code. (this is already the case) > > It's perhaps not clang-format's job to do this. > > I should personally declare myself as being in favor of allowing > clang-format to modify code, I think it only fair that I let others reply > to the RFC with their own concerns. > > Constraints > ==========> > To minimize the impact to existing users, We suggest that a number of > constraints be generally considered good practice when submitting reviews > for clang-format with modifying changes > > 1) Code Modifying Features should always be off by default > The user should have to make a positive affirmative action to use such a > feature > > 2) Code Modifying Features configuration options should be highlighted as > such in the ClangFormatStyleOptions.rst such that its clear these are > potentially code breaking options > > 3) Existing "Code Modifying Features" do not have to adhere to 1) but the > documentation should be updated to adhere to 2) > > 4) Code Modifying Features should be conservative to be "correct first" > before being "complete". > i.e. If it's possible a change could be ambiguous it should tend towards > not making the incorrect change at all rather than forcing an incorrect > change. (This could cause some > cases to be missed) > > Request > ======> > I would like to get some general agreement that it's OK for future reviews > of code modification changes to become part of clang-format (as they are in > IncludeSorting) assuming the best practices are > followed to protect users from unwanted changes. > > Feedback would be appreciated > > MyDeveloperDay > > > > > > > > > > > > > _______________________________________________ > cfe-dev mailing list > cfe-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210809/1e3432dd/attachment-0001.html>
Andrew Tomazos via llvm-dev
2021-Aug-09 20:44 UTC
[llvm-dev] [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes
I'd just offer a few thoughts: - Whether or not to adopt clang-format for a project is already quite controversial in some places. I think if we made it so that it also made more extensive non-whitespace changes, I think it would reduce its adoption, even if they were off by default. - Non-whitespace changes are very dangerous. A bug in clang-format could more easily create bugs in the code it formats (at least that is the perception). For some risk-averse projects, this would be an unacceptable risk. - Perhaps an alternative idea would be to create a new program called clang-reshape or something, that shared a common underlying framework with clang-format, but clang-reshape contains the options for both the non-whitespace changes and whitespace-changes, whereas clang-format only has the whitespace changes. That way folks could adopt clang-format and ban clang-reshape if they only wanted whitespace changes. HTH. -Andrew On Mon, Aug 9, 2021 at 7:36 PM MyDeveloper Day via cfe-dev < cfe-dev at lists.llvm.org> wrote:> Hi all, > > As a frequent user and maintainer of clang-format I would like to > formalize a change to the "charter" of what clang-format can/could/should do > > Motivation > =========> > As you all know clang-format parses source code and can manipulate the > tokens to produce consistently formatted code (most of the time), its > become ubiquitous in the industry and its integration into > popular editors & IDEs such as vim/visual studio/code mean it very simple > for users to get setup and running producing good looking code. > > clang-format does not use semantic information, and as such it doesn't > need includes, defines or compiler flags to interpret code. Clang-format is > generally guessing that certain sequences of tokens from the lexer > represent certain patterns. It's a good guess but it gets it wrong from > time to time, hence trading correctness for performance. > > Because of this clang-format is fast (not maybe as fast as we'd like) but > fast enough to be part of in a "save" operation such that the code is > formatted as the ide saves your work. > > Mostly clang-format manipulates only whitespace, but over the years there > have been a number of extremely useful features that have broken this rule, > namely include sorting, namespace commenting to name a few. > > The usage scenario of clang-format has also changed slightly to include a > non modifying advisory role identifying clang-format violations (as in our > own llvm-premerge checks), which can greatly aid the code review process by > removing the need to constantly ask for the code to be formatted correctly > or follow the LLVM convention. > > Recently a number of new features have been proposed that would also alter > code, insertion of braces, east/west const conversion that can be performed > at "save" speeds. > > As no semantic information is present, this raises the question as to > whether clang-format could actually break your code. > This has actually always been the case especially since the introduction > of include sorting, but also we all know that clang-format can break your > code from the visual perspective too and hence the need for // clang-format > off/on > > In the most part include sorting not only might break your code noisily > such that it won't compile, but it can also break it silently, > and because IncludeSorting is on by default this breakage could > potentially go unnoticed. > > https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code > https://travisdowns.github.io/blog/2019/11/19/toupper.html > > I don't think it can be in any doubt that IncludeSorting is a nice feature > used by 100,000's of developers without too many issues, but there is some > suggestion that its inclusion as "on by default" was potentially a mistake. > > Proposals for other new features that modify the code in a similar way are > getting some push back for changing the "charter" of clang-format when it > could be considered to have already changed. > This is causing friction on the review of some features and so it was > suggested to present an RFC to gain wider consensus on the concept of > clang-format changing code. > > Mostly when a code modifying change is submitted the view is that this > isn't clang-formats job but more clang-tidy, however clang-tidy requires > full access to all include files and compiler definitions and only works on > the preprocessor paths of the code you are analyzing for and its speed and > hence its frequency of use is drastically reduced. > > Some clang-format based modifications can in theory be made with a > relatively high level of confidence without paying the price and the > configuration complexity of getting all the > semantic information. https://reviews.llvm.org/D105701. > There is potentially for clang-format to introduce breaking changes and > whilst this could in theory cause noisy breakages they could also in theory > produce silent ones. > > These new features want to be run at "reformat" speeds & frequency and > benefit from the rich Ecosystem of inclusion and integration in IDEs and > editors that clang-format enjoys. > > This RFC is to try to gain some consensus as to what clang-format can do > and what the conditions/constraints should be if allowed to do so. > > Benefits > =======> > The benefits are that clang-format can be used to further make code > modifications to adhere to a coding convention (insertion/removal of > braces), > clang-format could be used to validate and correct coding convention > (left/right const), and could be used to remove unnecessary semicolons or > potentially convert code to trailing return types all of which could be > performed at "reformat" speeds. > > Whilst some of these capabilities are available in clang-tidy, it requires > significant infrastructure to often perform these often relatively simple > operations and it's unlikely > that all users of clang-format are set up to perform these actions in > clang-tidy. > > There are likely a number of clang-tidy modifications that could in theory > be made at "reformat" speeds with such an approach. But there really needs > some agreement that it's OK for clang-format to modify the code. > > Allowing these kinds of modification capabilities could lead to a new set > of "Resharper" style capabilities being added to clang-format, > capable of bringing source code quickly into line with coding conventions. > > Concerns > =======> > Correctness is King, the concern is your formatting tool should not > perform operations that could break your code. (this is already the case) > > It's perhaps not clang-format's job to do this. > > I should personally declare myself as being in favor of allowing > clang-format to modify code, I think it only fair that I let others reply > to the RFC with their own concerns. > > Constraints > ==========> > To minimize the impact to existing users, We suggest that a number of > constraints be generally considered good practice when submitting reviews > for clang-format with modifying changes > > 1) Code Modifying Features should always be off by default > The user should have to make a positive affirmative action to use such a > feature > > 2) Code Modifying Features configuration options should be highlighted as > such in the ClangFormatStyleOptions.rst such that its clear these are > potentially code breaking options > > 3) Existing "Code Modifying Features" do not have to adhere to 1) but the > documentation should be updated to adhere to 2) > > 4) Code Modifying Features should be conservative to be "correct first" > before being "complete". > i.e. If it's possible a change could be ambiguous it should tend towards > not making the incorrect change at all rather than forcing an incorrect > change. (This could cause some > cases to be missed) > > Request > ======> > I would like to get some general agreement that it's OK for future reviews > of code modification changes to become part of clang-format (as they are in > IncludeSorting) assuming the best practices are > followed to protect users from unwanted changes. > > Feedback would be appreciated > > MyDeveloperDay > > > > > > > > > > > > > _______________________________________________ > cfe-dev mailing list > cfe-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210809/de7f0608/attachment.html>
David Blaikie via llvm-dev
2021-Aug-09 20:53 UTC
[llvm-dev] [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes
On Mon, Aug 9, 2021 at 1:44 PM Andrew Tomazos via cfe-dev < cfe-dev at lists.llvm.org> wrote:> I'd just offer a few thoughts: > > - Whether or not to adopt clang-format for a project is already quite > controversial in some places. I think if we made it so that it also made > more extensive non-whitespace changes, I think it would reduce > its adoption, even if they were off by default. >Yeah? I wouldn't've figured that. - Non-whitespace changes are very dangerous. A bug in clang-format could> more easily create bugs in the code it formats (at least that is the > perception). For some risk-averse projects, this would be an > unacceptable risk. > > - Perhaps an alternative idea would be to create a new program called > clang-reshape or something, that shared a common underlying framework with > clang-format, but clang-reshape contains the options for both the > non-whitespace changes and whitespace-changes, whereas clang-format only > has the whitespace changes. That way folks could adopt clang-format and > ban clang-reshape if they only wanted whitespace changes. >This crossed my mind too - but I expect a lot of the clang-format integration may be more invasive than that. (hardcoded binary names, use of clang-format as a library, etc) - adding new features to the existing adopted clang-format may be especially valuable compared to providing a new tool. (that said, I do also feel like grouping features in an existing tool only because of its existing adoption is super great either) - Dave> > HTH. -Andrew > > > On Mon, Aug 9, 2021 at 7:36 PM MyDeveloper Day via cfe-dev < > cfe-dev at lists.llvm.org> wrote: > >> Hi all, >> >> As a frequent user and maintainer of clang-format I would like to >> formalize a change to the "charter" of what clang-format can/could/should do >> >> Motivation >> =========>> >> As you all know clang-format parses source code and can manipulate the >> tokens to produce consistently formatted code (most of the time), its >> become ubiquitous in the industry and its integration into >> popular editors & IDEs such as vim/visual studio/code mean it very simple >> for users to get setup and running producing good looking code. >> >> clang-format does not use semantic information, and as such it doesn't >> need includes, defines or compiler flags to interpret code. Clang-format is >> generally guessing that certain sequences of tokens from the lexer >> represent certain patterns. It's a good guess but it gets it wrong from >> time to time, hence trading correctness for performance. >> >> Because of this clang-format is fast (not maybe as fast as we'd like) but >> fast enough to be part of in a "save" operation such that the code is >> formatted as the ide saves your work. >> >> Mostly clang-format manipulates only whitespace, but over the years there >> have been a number of extremely useful features that have broken this rule, >> namely include sorting, namespace commenting to name a few. >> >> The usage scenario of clang-format has also changed slightly to include a >> non modifying advisory role identifying clang-format violations (as in our >> own llvm-premerge checks), which can greatly aid the code review process by >> removing the need to constantly ask for the code to be formatted correctly >> or follow the LLVM convention. >> >> Recently a number of new features have been proposed that would also >> alter code, insertion of braces, east/west const conversion that can be >> performed at "save" speeds. >> >> As no semantic information is present, this raises the question as to >> whether clang-format could actually break your code. >> This has actually always been the case especially since the introduction >> of include sorting, but also we all know that clang-format can break your >> code from the visual perspective too and hence the need for // clang-format >> off/on >> >> In the most part include sorting not only might break your code noisily >> such that it won't compile, but it can also break it silently, >> and because IncludeSorting is on by default this breakage could >> potentially go unnoticed. >> >> >> https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code >> https://travisdowns.github.io/blog/2019/11/19/toupper.html >> >> I don't think it can be in any doubt that IncludeSorting is a nice >> feature used by 100,000's of developers without too many issues, but there >> is some suggestion that its inclusion as "on by default" was potentially a >> mistake. >> >> Proposals for other new features that modify the code in a similar way >> are getting some push back for changing the "charter" of clang-format when >> it could be considered to have already changed. >> This is causing friction on the review of some features and so it was >> suggested to present an RFC to gain wider consensus on the concept of >> clang-format changing code. >> >> Mostly when a code modifying change is submitted the view is that this >> isn't clang-formats job but more clang-tidy, however clang-tidy requires >> full access to all include files and compiler definitions and only works on >> the preprocessor paths of the code you are analyzing for and its speed and >> hence its frequency of use is drastically reduced. >> >> Some clang-format based modifications can in theory be made with a >> relatively high level of confidence without paying the price and the >> configuration complexity of getting all the >> semantic information. https://reviews.llvm.org/D105701. >> There is potentially for clang-format to introduce breaking changes and >> whilst this could in theory cause noisy breakages they could also in theory >> produce silent ones. >> >> These new features want to be run at "reformat" speeds & frequency and >> benefit from the rich Ecosystem of inclusion and integration in IDEs and >> editors that clang-format enjoys. >> >> This RFC is to try to gain some consensus as to what clang-format can do >> and what the conditions/constraints should be if allowed to do so. >> >> Benefits >> =======>> >> The benefits are that clang-format can be used to further make code >> modifications to adhere to a coding convention (insertion/removal of >> braces), >> clang-format could be used to validate and correct coding convention >> (left/right const), and could be used to remove unnecessary semicolons or >> potentially convert code to trailing return types all of which could be >> performed at "reformat" speeds. >> >> Whilst some of these capabilities are available in clang-tidy, it >> requires significant infrastructure to often perform these often relatively >> simple operations and it's unlikely >> that all users of clang-format are set up to perform these actions in >> clang-tidy. >> >> There are likely a number of clang-tidy modifications that could in >> theory be made at "reformat" speeds with such an approach. But there really >> needs some agreement that it's OK for clang-format to modify the code. >> >> Allowing these kinds of modification capabilities could lead to a new set >> of "Resharper" style capabilities being added to clang-format, >> capable of bringing source code quickly into line with coding conventions. >> >> Concerns >> =======>> >> Correctness is King, the concern is your formatting tool should not >> perform operations that could break your code. (this is already the case) >> >> It's perhaps not clang-format's job to do this. >> >> I should personally declare myself as being in favor of allowing >> clang-format to modify code, I think it only fair that I let others reply >> to the RFC with their own concerns. >> >> Constraints >> ==========>> >> To minimize the impact to existing users, We suggest that a number of >> constraints be generally considered good practice when submitting reviews >> for clang-format with modifying changes >> >> 1) Code Modifying Features should always be off by default >> The user should have to make a positive affirmative action to use such a >> feature >> >> 2) Code Modifying Features configuration options should be highlighted as >> such in the ClangFormatStyleOptions.rst such that its clear these are >> potentially code breaking options >> >> 3) Existing "Code Modifying Features" do not have to adhere to 1) but the >> documentation should be updated to adhere to 2) >> >> 4) Code Modifying Features should be conservative to be "correct first" >> before being "complete". >> i.e. If it's possible a change could be ambiguous it should tend towards >> not making the incorrect change at all rather than forcing an incorrect >> change. (This could cause some >> cases to be missed) >> >> Request >> ======>> >> I would like to get some general agreement that it's OK for future >> reviews of code modification changes to become part of clang-format (as >> they are in IncludeSorting) assuming the best practices are >> followed to protect users from unwanted changes. >> >> Feedback would be appreciated >> >> MyDeveloperDay >> >> >> >> >> >> >> >> >> >> >> >> >> _______________________________________________ >> cfe-dev mailing list >> cfe-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev >> > _______________________________________________ > cfe-dev mailing list > cfe-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210809/55978b1f/attachment.html>
Nicolai Hähnle via llvm-dev
2021-Aug-09 21:06 UTC
[llvm-dev] [RFC] Adding support for clang-format making further code modifying changes
Aren't there two separate discussions here: 1. What is the set of features in clang-format? 2. What is the subset of those features used in the LLVM (and other "major") coding styles? For me personally, the answers to those questions are quite different. For (2), I feel that static analysis tools which aim for broad adoption must keep false positives to a minimum, otherwise the tool will just end up being ignored. clang-format is arguably a static analysis tool, albeit a very simple-minded one. So I wouldn't want to run the risk of false positives in the default LLVM style. For (1), if some users really want to have such features for their own, non-LLVM coding style, then I'd say it's perfectly fine for clang-format to include those features as long as the majority of users aren't affected directly or indirectly. Indirect effects could be caused e.g. by losing some performance ("save speed" is a valuable feature) or by making clang-format harder to maintain. Cheers, Nicolai On Mon, Aug 9, 2021 at 9:36 PM MyDeveloper Day via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi all, > > As a frequent user and maintainer of clang-format I would like to > formalize a change to the "charter" of what clang-format can/could/should do > > Motivation > =========> > As you all know clang-format parses source code and can manipulate the > tokens to produce consistently formatted code (most of the time), its > become ubiquitous in the industry and its integration into > popular editors & IDEs such as vim/visual studio/code mean it very simple > for users to get setup and running producing good looking code. > > clang-format does not use semantic information, and as such it doesn't > need includes, defines or compiler flags to interpret code. Clang-format is > generally guessing that certain sequences of tokens from the lexer > represent certain patterns. It's a good guess but it gets it wrong from > time to time, hence trading correctness for performance. > > Because of this clang-format is fast (not maybe as fast as we'd like) but > fast enough to be part of in a "save" operation such that the code is > formatted as the ide saves your work. > > Mostly clang-format manipulates only whitespace, but over the years there > have been a number of extremely useful features that have broken this rule, > namely include sorting, namespace commenting to name a few. > > The usage scenario of clang-format has also changed slightly to include a > non modifying advisory role identifying clang-format violations (as in our > own llvm-premerge checks), which can greatly aid the code review process by > removing the need to constantly ask for the code to be formatted correctly > or follow the LLVM convention. > > Recently a number of new features have been proposed that would also alter > code, insertion of braces, east/west const conversion that can be performed > at "save" speeds. > > As no semantic information is present, this raises the question as to > whether clang-format could actually break your code. > This has actually always been the case especially since the introduction > of include sorting, but also we all know that clang-format can break your > code from the visual perspective too and hence the need for // clang-format > off/on > > In the most part include sorting not only might break your code noisily > such that it won't compile, but it can also break it silently, > and because IncludeSorting is on by default this breakage could > potentially go unnoticed. > > https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code > https://travisdowns.github.io/blog/2019/11/19/toupper.html > > I don't think it can be in any doubt that IncludeSorting is a nice feature > used by 100,000's of developers without too many issues, but there is some > suggestion that its inclusion as "on by default" was potentially a mistake. > > Proposals for other new features that modify the code in a similar way are > getting some push back for changing the "charter" of clang-format when it > could be considered to have already changed. > This is causing friction on the review of some features and so it was > suggested to present an RFC to gain wider consensus on the concept of > clang-format changing code. > > Mostly when a code modifying change is submitted the view is that this > isn't clang-formats job but more clang-tidy, however clang-tidy requires > full access to all include files and compiler definitions and only works on > the preprocessor paths of the code you are analyzing for and its speed and > hence its frequency of use is drastically reduced. > > Some clang-format based modifications can in theory be made with a > relatively high level of confidence without paying the price and the > configuration complexity of getting all the > semantic information. https://reviews.llvm.org/D105701. > There is potentially for clang-format to introduce breaking changes and > whilst this could in theory cause noisy breakages they could also in theory > produce silent ones. > > These new features want to be run at "reformat" speeds & frequency and > benefit from the rich Ecosystem of inclusion and integration in IDEs and > editors that clang-format enjoys. > > This RFC is to try to gain some consensus as to what clang-format can do > and what the conditions/constraints should be if allowed to do so. > > Benefits > =======> > The benefits are that clang-format can be used to further make code > modifications to adhere to a coding convention (insertion/removal of > braces), > clang-format could be used to validate and correct coding convention > (left/right const), and could be used to remove unnecessary semicolons or > potentially convert code to trailing return types all of which could be > performed at "reformat" speeds. > > Whilst some of these capabilities are available in clang-tidy, it requires > significant infrastructure to often perform these often relatively simple > operations and it's unlikely > that all users of clang-format are set up to perform these actions in > clang-tidy. > > There are likely a number of clang-tidy modifications that could in theory > be made at "reformat" speeds with such an approach. But there really needs > some agreement that it's OK for clang-format to modify the code. > > Allowing these kinds of modification capabilities could lead to a new set > of "Resharper" style capabilities being added to clang-format, > capable of bringing source code quickly into line with coding conventions. > > Concerns > =======> > Correctness is King, the concern is your formatting tool should not > perform operations that could break your code. (this is already the case) > > It's perhaps not clang-format's job to do this. > > I should personally declare myself as being in favor of allowing > clang-format to modify code, I think it only fair that I let others reply > to the RFC with their own concerns. > > Constraints > ==========> > To minimize the impact to existing users, We suggest that a number of > constraints be generally considered good practice when submitting reviews > for clang-format with modifying changes > > 1) Code Modifying Features should always be off by default > The user should have to make a positive affirmative action to use such a > feature > > 2) Code Modifying Features configuration options should be highlighted as > such in the ClangFormatStyleOptions.rst such that its clear these are > potentially code breaking options > > 3) Existing "Code Modifying Features" do not have to adhere to 1) but the > documentation should be updated to adhere to 2) > > 4) Code Modifying Features should be conservative to be "correct first" > before being "complete". > i.e. If it's possible a change could be ambiguous it should tend towards > not making the incorrect change at all rather than forcing an incorrect > change. (This could cause some > cases to be missed) > > Request > ======> > I would like to get some general agreement that it's OK for future reviews > of code modification changes to become part of clang-format (as they are in > IncludeSorting) assuming the best practices are > followed to protect users from unwanted changes. > > Feedback would be appreciated > > MyDeveloperDay > > > > > > > > > > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Lerne, wie die Welt wirklich ist, aber vergiss niemals, wie sie sein sollte. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210809/8a4d1e38/attachment.html>
Sam McCall via llvm-dev
2021-Aug-09 22:23 UTC
[llvm-dev] [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes
I'm cautiously +1 on const reordering, having previously opposed it and been convinced. I think anyone who's worked on a large shared codebase both before and after clang-format can understand the value here, so I'll focus mostly on the risks and why I think they're acceptable. *Risk: *clang-format will become a grab-bag of features with no clear line - just anything implemented on top of its pseudo-AST. Clang-format's brand is low-level formatting details and I think it's important to preserve this. Const order fits here in users' minds. (So does brace addition/removal). *Risk*: The feature will break code and clang-format will no longer be (seen as) reliable. This can make it harder socially or technically to deploy, and cause real damage. I think we need to work hard on mitigating this: - the feature needs careful design and extra scrutiny, like security-critical code - it should be clearly and temporarily marked as experimental, with opt-in required - it should be clearly and permanently marked as "makes assumptions about coding style", with opt-in required. - bugs need to be thoughtfully addressed>From what I can see MyDeveloperDay is serious about doing all of this.*Risk*: clang-format will be overtaken by the complexity of such features, which will outweigh the benefits (if few use them), hurting maintenance, causing bugs etc. However this isn't different from other optional features. Editing tokens tends to be done as a separate pass which is relatively easy to isolate (compared to something like supporting a new language). With complexity isolated, this is mostly just about how maintainers prioritize their time/attention, which must be left up to them. Regarding include-ordering: I think this is a valuable feature if you follow a coding style that allows it to be correct, and it fits well in clang-format's brand. However it wasn't clearly labeled to emphasize its caveats, and in hindsight it shouldn't have been made part of the Google style without further opt-in required. Cheers, Sam -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210810/4e2556c6/attachment.html>
Aaron Ballman via llvm-dev
2021-Aug-10 12:33 UTC
[llvm-dev] [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes
Thank you for putting this RFC together! Given how important clang-format is to various people's workflows and the *massive* user base for the tool (who largely aren't represented here on the mailing lists), I think it's good that we are having a broader discussion about tool expectations for clang-format. I come down on the side of not wanting my code formatting tool to be opinionated about the input source code I feed it. For my personal needs, I need to be able to trust my formatting tool won't modify the meaning of my code. If it modifies the meaning in a way that gets compile errors, that's bad (it means I'm having to work around a helper tool in order to pass my CI pipeline, which is highly frustrating). I know that clang-format already happily breaks code with include sorting, and I think it was a mistake to add that functionality, especially given the lack of discussion about breaking code during the original review of the feature. Changes that can *silently* change the meaning of my code are an even worse concern because of the increased risk of not noticing the semantic change. (Keep in mind that some workflows will format an entire file when saving the file, so there may not even be direct user interaction involved in the formatting.) Putting potentially breaking changes behind configuration flags is a solution, but not one that I'm all that excited by. There are already *a lot* of configuration options for clang-format, and having to remember which ones may destroy my input source is a burden. We could use a naming convention to clearly identify all of these options, but the first time we have a configuration option that doesn't follow the naming convention, we're back to the "I can't know which options are high-risk" problem. Also, the fact that configuration options can be impacted by non-local configuration files elsewhere in the file system hierarchy can cause surprising configuration option behavior (I know we've run into the situation before where an inherited config option caused a bit of head-scratching as to where the option was set). I think the idea of a separate tool that's built on top of clang-format (consuming clang-format as a library with additional formatting features) has the most appeal to me. Then we no longer have to worry about the config options being the gate -- the tool is the deciding factor as to whether you want to opt into danger mode or not. One of the architectural benefits of designing a series of composable libraries is the ability to compose them into more powerful tools, I think we should take advantage of that. This also provides a migration path for more experimental functionality -- it can be implemented in the more dangerous tool, shake out the bugs from there, and then be shuffled into the safer tool if/when the issues have been worked out. Experience has taught me that just about the worst thing a tool can do is break user trust, and once you break that trust, you almost never get it back again. clang-format is a fantastic tool, but the more opinionated it becomes about source input, the less people are able to use it (by definition). ~Aaron On Mon, Aug 9, 2021 at 3:36 PM MyDeveloper Day via cfe-dev <cfe-dev at lists.llvm.org> wrote:> > Hi all, > > As a frequent user and maintainer of clang-format I would like to formalize a change to the "charter" of what clang-format can/could/should do > > Motivation > =========> > As you all know clang-format parses source code and can manipulate the tokens to produce consistently formatted code (most of the time), its become ubiquitous in the industry and its integration into > popular editors & IDEs such as vim/visual studio/code mean it very simple for users to get setup and running producing good looking code. > > clang-format does not use semantic information, and as such it doesn't need includes, defines or compiler flags to interpret code. Clang-format is generally guessing that certain sequences of tokens from the lexer represent certain patterns. It's a good guess but it gets it wrong from time to time, hence trading correctness for performance. > > Because of this clang-format is fast (not maybe as fast as we'd like) but fast enough to be part of in a "save" operation such that the code is formatted as the ide saves your work. > > Mostly clang-format manipulates only whitespace, but over the years there have been a number of extremely useful features that have broken this rule, namely include sorting, namespace commenting to name a few. > > The usage scenario of clang-format has also changed slightly to include a non modifying advisory role identifying clang-format violations (as in our own llvm-premerge checks), which can greatly aid the code review process by removing the need to constantly ask for the code to be formatted correctly or follow the LLVM convention. > > Recently a number of new features have been proposed that would also alter code, insertion of braces, east/west const conversion that can be performed at "save" speeds. > > As no semantic information is present, this raises the question as to whether clang-format could actually break your code. > This has actually always been the case especially since the introduction of include sorting, but also we all know that clang-format can break your code from the visual perspective too and hence the need for // clang-format off/on > > In the most part include sorting not only might break your code noisily such that it won't compile, but it can also break it silently, > and because IncludeSorting is on by default this breakage could potentially go unnoticed. > > https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code > https://travisdowns.github.io/blog/2019/11/19/toupper.html > > I don't think it can be in any doubt that IncludeSorting is a nice feature used by 100,000's of developers without too many issues, but there is some suggestion that its inclusion as "on by default" was potentially a mistake. > > Proposals for other new features that modify the code in a similar way are getting some push back for changing the "charter" of clang-format when it could be considered to have already changed. > This is causing friction on the review of some features and so it was suggested to present an RFC to gain wider consensus on the concept of clang-format changing code. > > Mostly when a code modifying change is submitted the view is that this isn't clang-formats job but more clang-tidy, however clang-tidy requires full access to all include files and compiler definitions and only works on the preprocessor paths of the code you are analyzing for and its speed and hence its frequency of use is drastically reduced. > > Some clang-format based modifications can in theory be made with a relatively high level of confidence without paying the price and the configuration complexity of getting all the > semantic information. https://reviews.llvm.org/D105701. > There is potentially for clang-format to introduce breaking changes and whilst this could in theory cause noisy breakages they could also in theory produce silent ones. > > These new features want to be run at "reformat" speeds & frequency and benefit from the rich Ecosystem of inclusion and integration in IDEs and editors that clang-format enjoys. > > This RFC is to try to gain some consensus as to what clang-format can do and what the conditions/constraints should be if allowed to do so. > > Benefits > =======> > The benefits are that clang-format can be used to further make code modifications to adhere to a coding convention (insertion/removal of braces), > clang-format could be used to validate and correct coding convention (left/right const), and could be used to remove unnecessary semicolons or > potentially convert code to trailing return types all of which could be performed at "reformat" speeds. > > Whilst some of these capabilities are available in clang-tidy, it requires significant infrastructure to often perform these often relatively simple operations and it's unlikely > that all users of clang-format are set up to perform these actions in clang-tidy. > > There are likely a number of clang-tidy modifications that could in theory be made at "reformat" speeds with such an approach. But there really needs some agreement that it's OK for clang-format to modify the code. > > Allowing these kinds of modification capabilities could lead to a new set of "Resharper" style capabilities being added to clang-format, > capable of bringing source code quickly into line with coding conventions. > > Concerns > =======> > Correctness is King, the concern is your formatting tool should not perform operations that could break your code. (this is already the case) > > It's perhaps not clang-format's job to do this. > > I should personally declare myself as being in favor of allowing clang-format to modify code, I think it only fair that I let others reply to the RFC with their own concerns. > > Constraints > ==========> > To minimize the impact to existing users, We suggest that a number of constraints be generally considered good practice when submitting reviews for clang-format with modifying changes > > 1) Code Modifying Features should always be off by default > The user should have to make a positive affirmative action to use such a feature > > 2) Code Modifying Features configuration options should be highlighted as such in the ClangFormatStyleOptions.rst such that its clear these are potentially code breaking options > > 3) Existing "Code Modifying Features" do not have to adhere to 1) but the documentation should be updated to adhere to 2) > > 4) Code Modifying Features should be conservative to be "correct first" before being "complete". > i.e. If it's possible a change could be ambiguous it should tend towards not making the incorrect change at all rather than forcing an incorrect change. (This could cause some > cases to be missed) > > Request > ======> > I would like to get some general agreement that it's OK for future reviews of code modification changes to become part of clang-format (as they are in IncludeSorting) assuming the best practices are > followed to protect users from unwanted changes. > > Feedback would be appreciated > > MyDeveloperDay > > > > > > > > > > > > > _______________________________________________ > cfe-dev mailing list > cfe-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Chris Lattner via llvm-dev
2021-Aug-10 19:02 UTC
[llvm-dev] [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes
First of all, thank you so much for writing up this clear and mostly balanced summary of the situation, the concerns, and the decision point ahead here. I think there is a something fundamental problem to the nature of this discussion: we can all make semi-informed guesses about how well a particular format will work in practice, but we can’t know until there is usage experience. Furthermore, clang-format has multiple different communities with different tradeoffs, and imposing a new format on an existing codebase has concerns. Clang format is already highly parameterized to support this, are you saying that there is pushback on adding new off-by-default capabilities to clang-format, or is the pushback about adding these capabilities to existing language modes (llvm style, google style, etc)? I don’t see an obvious problem with introducing off-by-default capabilities. I can see a reasonable concern about introducing “const moving” or other new things into existing formats by default - that could be disruptive, and depends a lot about how well the algorithm and implementation works in practice. Two thoughts on how to make progress here: 1) You could implement these things in an off-by-default setting, ship it out to lots of people, then gain data somehow (e.g. ask for feedback). 2) We could introduce a new top level “changing my code is allowed” mode to clang-format and put these checks into that. You could even conceptually move namespace commenting and include sorting to that mode, making clang-format more consistent. -Chris> On Aug 9, 2021, at 12:36 PM, MyDeveloper Day via cfe-dev <cfe-dev at lists.llvm.org> wrote: > > Hi all, > > As a frequent user and maintainer of clang-format I would like to formalize a change to the "charter" of what clang-format can/could/should do > > Motivation > =========> > As you all know clang-format parses source code and can manipulate the tokens to produce consistently formatted code (most of the time), its become ubiquitous in the industry and its integration into > popular editors & IDEs such as vim/visual studio/code mean it very simple for users to get setup and running producing good looking code. > > clang-format does not use semantic information, and as such it doesn't need includes, defines or compiler flags to interpret code. Clang-format is generally guessing that certain sequences of tokens from the lexer represent certain patterns. It's a good guess but it gets it wrong from time to time, hence trading correctness for performance. > > Because of this clang-format is fast (not maybe as fast as we'd like) but fast enough to be part of in a "save" operation such that the code is formatted as the ide saves your work. > > Mostly clang-format manipulates only whitespace, but over the years there have been a number of extremely useful features that have broken this rule, namely include sorting, namespace commenting to name a few. > > The usage scenario of clang-format has also changed slightly to include a non modifying advisory role identifying clang-format violations (as in our own llvm-premerge checks), which can greatly aid the code review process by removing the need to constantly ask for the code to be formatted correctly or follow the LLVM convention. > > Recently a number of new features have been proposed that would also alter code, insertion of braces, east/west const conversion that can be performed at "save" speeds. > > As no semantic information is present, this raises the question as to whether clang-format could actually break your code. > This has actually always been the case especially since the introduction of include sorting, but also we all know that clang-format can break your code from the visual perspective too and hence the need for // clang-format off/on > > In the most part include sorting not only might break your code noisily such that it won't compile, but it can also break it silently, > and because IncludeSorting is on by default this breakage could potentially go unnoticed. > > https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code <https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code> > https://travisdowns.github.io/blog/2019/11/19/toupper.html <https://travisdowns.github.io/blog/2019/11/19/toupper.html> > > I don't think it can be in any doubt that IncludeSorting is a nice feature used by 100,000's of developers without too many issues, but there is some suggestion that its inclusion as "on by default" was potentially a mistake. > > Proposals for other new features that modify the code in a similar way are getting some push back for changing the "charter" of clang-format when it could be considered to have already changed. > This is causing friction on the review of some features and so it was suggested to present an RFC to gain wider consensus on the concept of clang-format changing code. > > Mostly when a code modifying change is submitted the view is that this isn't clang-formats job but more clang-tidy, however clang-tidy requires full access to all include files and compiler definitions and only works on the preprocessor paths of the code you are analyzing for and its speed and hence its frequency of use is drastically reduced. > > Some clang-format based modifications can in theory be made with a relatively high level of confidence without paying the price and the configuration complexity of getting all the > semantic information. https://reviews.llvm.org/D105701 <https://reviews.llvm.org/D105701>. > There is potentially for clang-format to introduce breaking changes and whilst this could in theory cause noisy breakages they could also in theory produce silent ones. > > These new features want to be run at "reformat" speeds & frequency and benefit from the rich Ecosystem of inclusion and integration in IDEs and editors that clang-format enjoys. > > This RFC is to try to gain some consensus as to what clang-format can do and what the conditions/constraints should be if allowed to do so. > > Benefits > =======> > The benefits are that clang-format can be used to further make code modifications to adhere to a coding convention (insertion/removal of braces), > clang-format could be used to validate and correct coding convention (left/right const), and could be used to remove unnecessary semicolons or > potentially convert code to trailing return types all of which could be performed at "reformat" speeds. > > Whilst some of these capabilities are available in clang-tidy, it requires significant infrastructure to often perform these often relatively simple operations and it's unlikely > that all users of clang-format are set up to perform these actions in clang-tidy. > > There are likely a number of clang-tidy modifications that could in theory be made at "reformat" speeds with such an approach. But there really needs some agreement that it's OK for clang-format to modify the code. > > Allowing these kinds of modification capabilities could lead to a new set of "Resharper" style capabilities being added to clang-format, > capable of bringing source code quickly into line with coding conventions. > > Concerns > =======> > Correctness is King, the concern is your formatting tool should not perform operations that could break your code. (this is already the case) > > It's perhaps not clang-format's job to do this. > > I should personally declare myself as being in favor of allowing clang-format to modify code, I think it only fair that I let others reply to the RFC with their own concerns. > > Constraints > ==========> > To minimize the impact to existing users, We suggest that a number of constraints be generally considered good practice when submitting reviews for clang-format with modifying changes > > 1) Code Modifying Features should always be off by default > The user should have to make a positive affirmative action to use such a feature > > 2) Code Modifying Features configuration options should be highlighted as such in the ClangFormatStyleOptions.rst such that its clear these are potentially code breaking options > > 3) Existing "Code Modifying Features" do not have to adhere to 1) but the documentation should be updated to adhere to 2) > > 4) Code Modifying Features should be conservative to be "correct first" before being "complete". > i.e. If it's possible a change could be ambiguous it should tend towards not making the incorrect change at all rather than forcing an incorrect change. (This could cause some > cases to be missed) > > Request > ======> > I would like to get some general agreement that it's OK for future reviews of code modification changes to become part of clang-format (as they are in IncludeSorting) assuming the best practices are > followed to protect users from unwanted changes. > > Feedback would be appreciated > > MyDeveloperDay > > > > > > > > > > > > > _______________________________________________ > cfe-dev mailing list > cfe-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210810/8ce45bcb/attachment.html>