On Mon, Oct 13, 2014 at 3:19 PM, Chandler Carruth <chandlerc at google.com> wrote:

> On Mon, Oct 13, 2014 at 3:04 PM, Nick Kledzik <kledzik at apple.com> wrote:
>
>> I’d like to discuss revising the LLVM coding conventions to change the
>> naming of variables to start with a lowercase letter.
>
> Almost all of your negatives of the current conventions also apply to your
> proposed convention.
>
> Type names: CamelCase
> Function names: camelCase
> Variable names: ???
>
> If we name variables in camelCase then variable names and function names
> collide.
>
> If we are going to change how we name variables, I very much want them to
> not collide with either type names or function names. My suggestion would
> be "lower_case" names.

I think this would be bad:

  function();
  lambda();
  longFunction();
  long_lambda();

... but possibly not in practice, since function names rarely have only
one word.

A partial-camel-case, partly-underscores convention sounds strange to me.
(I don't find this to be problematic for BIG_SCARY_MACROS and for
ABCK_EnumNamespaces because the former are rare and in the latter case the
underscore isn't a word separator, it's a namespace separator.) We have a
few people here who are used to such a style (since it's what the Google
style guide and derivatives use); any useful feedback from that experience?

Some arguments against the change as proposed:

1. Initialisms. It's common in Clang code (also in LLVM?) to use
initialisms as variable names. This doesn't really seem to work for names
that start with a lower case letter.

2. The ambiguity introduced might be worse than the one removed. It's
usually easy to see if a name is a type or variable from the context of the
use. It's not so easy to see if a name is a function or a variable,
especially as more variables become callable due to the prevalence of
lambdas.

> This also happens to be the vastly most common pattern across all C++
> coding styles and C-based language coding styles I have seen.
>
>> This should not be a discussion on the pain of such a transition, or how
>> to get from here to there, but rather, if there is a better place to be.
>>
>> My arguments for the change are:
>>
>> 1. No other popular C++ coding style uses capitalized variable names.
>> For instance here are other popular C++ conventions that use camelCase:
>>
>> http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml

This does not use camelCase for variable names.

>> http://www.c-xx.com/ccc/ccc.php
>> http://geosoft.no/development/cppstyle.html
>>
>> And, of course, the all-lower-case conventions (e.g. C++ ARM) don’t
>> capitalize variable names. In addition, all the common C derived languages
>> don’t use capitalized variable names (e.g. Java, C#, Objective-C).

Some or all of those other conventions don't capitalize *any* names (other
than perhaps macros), so we're not going to become consistent with them by
making this change.

>> 2. Ambiguity. Capitalizing type names is common across most C++
>> conventions. But in LLVM variables are also capitalized which conflates
>> types and variables. Starting variable names with a lowercase letter
>> disambiguates variables from types. For instance, the following are
>> ambiguous using LLVM’s conventions:
>>
>>   Xxx Yyy(Zzz); // function prototype or local object construction?
>>   Aaa(Bbb); // function call or cast?
>>
>> 3. Allows name re-use. Since types and variables are both nouns, using
>> different capitalization allows you to use the same simple name for types
>> and variables, for instance:
>>
>>   Stream stream;
>>
>> 4. Dubious history. Years ago the LLVM coding convention did not specify
>> if variables were capitalized or not. Different contributors used
>> different styles. Then in an effort to make the style more uniform,
>> someone flipped a coin and updated the convention doc to say variables
>> should be capitalized. I never saw any on-list discussion about this.

FWIW, I thought the argument for the current convention was: capitalize
proper nouns (classes and variables), do not capitalize verbs (functions),
as in English. Though maybe that's just folklore.

>> 5. Momentum only. When I’ve talked with various contributors privately, I
>> have found no one who says they like capitalized variables. It seems like
>> everyone thinks the conventions are carved in stone...

Momentum is an argument against the change, not in favour of it: this
change has a re-learning cost for everyone who hacks on LLVM projects.
(Your point that no-one seems to like capitalized variables is valid, but
generally people are opposed to change too.)

I would add:

6. Lower barrier to entry. Our current convention is different from almost
all other C++ code, and new developers *very* frequently get it wrong.

>> My proposal is that we modify the LLVM Coding Conventions to have variable
>> names start with a lowercase letter.
>>
>> Index: CodingStandards.rst
>> ==================================================================
>> --- CodingStandards.rst (revision 219065)
>> +++ CodingStandards.rst (working copy)
>> @@ -1073,8 +1073,8 @@
>>    nouns and start with an upper-case letter (e.g. ``TextFileReader``).
>>
>>  * **Variable names** should be nouns (as they represent state). The name should
>> -  be camel case, and start with an upper case letter (e.g. ``Leader`` or
>> -  ``Boats``).
>> +  be camel case, and start with a lower case letter (e.g. ``leader`` or
>> +  ``boats``).
>>
>>  * **Function names** should be verb phrases (as they represent actions), and
>>    command-like function should be imperative. The name should be camel case,
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141013/906b1292/attachment.html>
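
For readers following the ambiguity argument (item 2 of the quoted proposal), here is a minimal sketch of why `Xxx Yyy(Zzz);` cannot be read in isolation. The names Xxx, Yyy and Zzz come from the proposal; the two namespaces and the choice of `int` are assumptions made only so that both readings fit in one file.

```cpp
#include <type_traits>

struct Xxx {
  explicit Xxx(int) {}
};

namespace zzz_is_type {
using Zzz = int;
// Zzz names a type, so this declares a function taking a Zzz and
// returning an Xxx.
Xxx Yyy(Zzz);
static_assert(std::is_function_v<decltype(Yyy)>);
} // namespace zzz_is_type

namespace zzz_is_variable {
int Zzz = 0;
// Zzz names a variable, so the same tokens define an object of type Xxx
// that is constructed from Zzz.
Xxx Yyy(Zzz);
static_assert(!std::is_function_v<decltype(Yyy)>);
} // namespace zzz_is_variable
```

The compiler resolves the two cases by looking up Zzz. A reader relying only on capitalization has no such lookup available, which is the point the proposal makes.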
On Mon, Oct 13, 2014 at 4:08 PM, Richard Smith <richard at metafoo.co.uk> wrote:

> I think this would be bad:
>
>   function();
>   lambda();
>   longFunction();
>   long_lambda();
>
> ... but possibly not in practice, since function names rarely have only
> one word.
>
> A partial-camel-case, partly-underscores convention sounds strange to me.
> (I don't find this to be problematic for BIG_SCARY_MACROS and for
> ABCK_EnumNamespaces because the former are rare and in the latter case the
> underscore isn't a word separator, it's a namespace separator.) We have a
> few people here who are used to such a style (since it's what the Google
> style guide and derivatives use); any useful feedback from that experience?

This has never come up as a practical problem in my time at Google. Or at
least, if it has, it was so rare and long ago that I can't remember it. I
don't expect it to be a problem in practice. Mostly that is because all of
the problematic cases have two words in them, with one of the words often
being "is" or a related obvious verb like "get", "create", etc.

> Some arguments against the change as proposed:
>
> 1. Initialisms. It's common in Clang code (also in LLVM?) to use
> initialisms as variable names. This doesn't really seem to work for names
> that start with a lower case letter.

I think we at least need a good answer to this.

> 2. The ambiguity introduced might be worse than the one removed. It's
> usually easy to see if a name is a type or variable from the context of the
> use. It's not so easy to see if a name is a function or a variable,
> especially as more variables become callable due to the prevalence of
> lambdas.
On Mon, Oct 13, 2014 at 4:14 PM, Chandler Carruth <chandlerc at google.com> wrote:

> On Mon, Oct 13, 2014 at 4:08 PM, Richard Smith <richard at metafoo.co.uk>
> wrote:
>
>> I think this would be bad:
>>
>>   function();
>>   lambda();
>>   longFunction();
>>   long_lambda();
>>
>> ... but possibly not in practice, since function names rarely have only
>> one word.
>>
>> A partial-camel-case, partly-underscores convention sounds strange to me.
>> (I don't find this to be problematic for BIG_SCARY_MACROS and for
>> ABCK_EnumNamespaces because the former are rare and in the latter case the
>> underscore isn't a word separator, it's a namespace separator.) We have a
>> few people here who are used to such a style (since it's what the Google
>> style guide and derivatives use); any useful feedback from that experience?
>
> This has never come up as a practical problem in my time at Google. Or at
> least, if it has, it was so rare and long ago that I can't remember it. I
> don't expect it to be a problem in practice. Mostly that is because all of
> the problematic cases have two words in them, with one of the words often
> being "is" or a related obvious verb like "get", "create", etc.

Thanks, that's really helpful to know.

>> Some arguments against the change as proposed:
>>
>> 1. Initialisms. It's common in Clang code (also in LLVM?) to use
>> initialisms as variable names. This doesn't really seem to work for names
>> that start with a lower case letter.
>
> I think we at least need a good answer to this.

OK; I think if we have a good answer to this, then either variableName or
variable_name works for me (though I still weakly prefer the former).
On Mon, Oct 13, 2014 at 4:14 PM, Chandler Carruth <chandlerc at google.com> wrote:

>> 1. Initialisms. It's common in Clang code (also in LLVM?) to use
>> initialisms as variable names. This doesn't really seem to work for names
>> that start with a lower case letter.
>
> I think we at least need a good answer to this.

As I really suspect this is the most important point to address, let me
make an attempt:

Variable names are *either* initialisms, written as all caps, or terms
written in lower case and separated by underscores. For the purposes of
variable naming "terms" can include words but also extremely common and
recognizable abbreviations within LLVM such as "rhs", "lhs", or "gep".
These types of terms should not be written as initialisms but as words. For
example, you might write "LE" or "lhs_expr" for the Left-hand Expression,
but not "LHSE" or "LHS_expr".

While I'm trying to avoid it, this has the advantage of leaving a large
number of initialisms in the existing code base as "stylish".

I'm not really happy with this rule, but it is the least disruptive and
most consistent I can come up with.

I would also be happy encouraging people not to use initialisms
excessively or where they are confusing. I think the current codebase uses
them more than is helpful.
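
The rule proposed above can be sketched as a small classifier. This is only an illustration of the rule as worded in the message, not an existing LLVM or clang-tidy check. The names `NameKind`, `classifyVariableName` and `word_like_terms` are hypothetical, and the list holds just the three abbreviations the message mentions. The sketch assumes its input is already a valid C++ identifier and follows the proposed lower_case style for its own variables.

```cpp
#include <array>
#include <string_view>

// Hypothetical result type: how a variable name fares under the sketched rule.
enum class NameKind { Initialism, LowerCaseTerms, Rejected };

// Sample of abbreviations that the rule treats as ordinary words. They are
// stored in capitals so they can be searched for inside an all-caps name.
inline constexpr std::array<std::string_view, 3> word_like_terms = {
    "LHS", "RHS", "GEP"};

constexpr bool isUpper(char c) { return c >= 'A' && c <= 'Z'; }
constexpr bool isLowerOrDigit(char c) {
  return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
}

constexpr NameKind classifyVariableName(std::string_view name) {
  if (name.empty())
    return NameKind::Rejected;

  bool all_upper = true;
  for (char c : name)
    all_upper = all_upper && isUpper(c);

  // Case 1: an initialism written entirely in capitals, e.g. "LE" or "BB".
  if (all_upper) {
    // "LHSE" spells a word-like term in capitals, which the rule disallows.
    for (std::string_view term : word_like_terms)
      if (name.find(term) != std::string_view::npos)
        return NameKind::Rejected;
    return NameKind::Initialism;
  }

  // Case 2: lower-case terms separated by single underscores, e.g.
  // "lhs_expr". Starting at true rejects a leading underscore.
  bool prev_was_underscore = true;
  for (char c : name) {
    if (c == '_') {
      if (prev_was_underscore)
        return NameKind::Rejected;
      prev_was_underscore = true;
    } else if (isLowerOrDigit(c)) {
      prev_was_underscore = false;
    } else {
      return NameKind::Rejected; // Mixed case such as "LHS_expr".
    }
  }
  return prev_was_underscore ? NameKind::Rejected : NameKind::LowerCaseTerms;
}
```

With this sketch, "LE" and "lhs_expr" are accepted while "LHSE" and "LHS_expr" are rejected, matching the four examples in the message. Any real check would need an agreed list of word-like terms, which is exactly the part the thread leaves open.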
On 10/13/2014 6:08 PM, Richard Smith wrote:
> 1. Initialisms. It's common in Clang code (also in LLVM?) to use
> initialisms as variable names. This doesn't really seem to work for
> names that start with a lower case letter.

In my local use of LLVM code, I tend to follow a lowercase variable
naming convention. However, I have taken to using Module *M, Function
*F, Instruction *I, etc. At longer abbreviations... well, I use gep,
lhs, rhs, but BB, GV, SI, LI, CI. I suppose my convention ends up being
that I use upper-case letters if it's referring to the current thing
being processed (and there is no ambiguity as to what is meant). Either
that, or I do it only for 1- or 2-character initialisms. :-)

This does have the added benefit of making a distinction between i (the
integer loop index count) and I (the current instruction being
processed) exceptionally clear.

-- 
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist
On 14.10.2014 at 02:31, Joshua Cranmer 🐧 wrote:
> On 10/13/2014 6:08 PM, Richard Smith wrote:
>> 1. Initialisms. It's common in Clang code (also in LLVM?) to use
>> initialisms as variable names. This doesn't really seem to work for
>> names that start with a lower case letter.
> In my local use of LLVM code, I tend to follow a lowercase variable
> naming convention. However, I have taken to using Module *M, Function
> *F, Instruction *I, etc. At longer abbreviations... well, I use gep,
> lhs, rhs, but BB, GV, SI, LI, CI. I suppose my convention ends up being
> that I use upper-case letters if it's referring to the current thing
> being processed (and there is no ambiguity as to what is meant). Either
> that, or I do it only for 1- or 2-character initialisms. :-)
>
> This does have the added benefit of making a distinction between i (the
> integer loop index count) and I (the current instruction being
> processed) exceptionally clear.

I always had the impression there is an implicit naming convention along
the following lines: In general, variable names are in lower case, even
when abbreviated. But, if a variable's name refers to its type and the
type is in the llvm or clang namespaces, then use the initials of the
type in capitals. So it would be an llvm::BasicBlock called BB, an
llvm::Instruction called I, and an llvm::GlobalValue called GV, but it
would be rhs, because there is no class llvm::RightHandSide.

This way, there is an implicit list of terms to abbreviate in capitals
in the code base, namely all types in the llvm namespace. I can also
imagine a short list of reserved variable names in the coding style,
e.g. I may only be used for Instructions, BB only for basic blocks, ...
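
The implicit convention described above could be checked mechanically along these lines. This is a rough sketch under the assumption that "initials of the type" means the capital letters of the unqualified CamelCase type name, in order. The function `isTypeInitialism` is hypothetical and not part of LLVM, and it takes the type name as a plain string instead of consulting the real llvm or clang namespaces.

```cpp
#include <cstddef>
#include <string_view>

constexpr bool isUpper(char c) { return c >= 'A' && c <= 'Z'; }

// Hypothetical check: true if `variable` consists of exactly the capital
// letters of the unqualified type name `type`, in the same order.
constexpr bool isTypeInitialism(std::string_view type,
                                std::string_view variable) {
  std::size_t next = 0; // Index of the next character of `variable` to match.
  for (char c : type) {
    if (!isUpper(c))
      continue;
    if (next == variable.size() || variable[next] != c)
      return false;
    ++next;
  }
  // Every capital was matched, none are left over, and there was at least one.
  return next == variable.size() && next > 0;
}
```

Under this reading, BB matches BasicBlock, I matches Instruction and GV matches GlobalValue, while a lower-case name such as rhs never matches. Type names that contain an acronym would need special handling, since every capital letter counts as an initial here.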