Is there a concise catalog of variable naming-conventions? - variables

There are many different styles of variable names that I've come across over the years.
The current wikipedia entry on naming conventions is fairly light...
I'd love to see a concise catalog of variable naming-conventions, identifying it by a name/description, and some examples.
If a convention is particularly favored by a certain platform community, that would be worth noting, too.
I'm turning this into a community wiki, so please create an answer for each convention, and edit as needed.

The best naming convention set that I've seen is in the book "Code Complete" Steve McConnell has a great section in there about naming conventions and lots of examples. His examples run through a number of "best practices" for different languages, but ultimately leave it up to the developer, dev manager, or architect to decide the specific action.

PEP8 is the style guide for Python code that is part of the standard library which is relatively influential in the Python community. For bonus points, in addition to covering the naming conventions used by the Python standard library, it provides an overview of naming conventions in general.
According to PEP8: variables, instance variables, functions and methods in the standard library are supposed to use lowercase words separated by underscores for readability:
foo
parse_foo
a_long_descriptive_name
To distinguish a name from a reserved word, append an underscore:
in_
and_
or_
Names that are weakly private (not imported by 'from M import *') start with a single underscore. Names whose privacy is enforced with name mangling start with double underscores.:
_some_what_private
__a_slightly_more_private_name
Names that are Python 'special' methods start and end with double underscores:
__hash__ # hash(o) = o.__hash__()
__str__ # str(o) = o.__str__()

Sun published a list of variable name conventions for Java here. The list includes conventions for naming packages, classes, interfaces, methods, variables, and constants.
The section on variables states
Except for variables, all instance, class, and class constants are in mixed case with a lowercase first letter. Internal words start with capital letters. Variable names should not start with underscore _ or dollar sign $ characters, even though both are allowed.
Variable names should be short yet meaningful. The choice of a variable name should be mnemonic- that is, designed to indicate to the casual observer the intent of its use. One-character variable names should be avoided except for temporary "throwaway" variables. Common names for temporary variables are i, j, k, m, and n for integers; c, d, and e for characters.
Some examples include
int i;
char c;
float myWidth;

Whatever your naming convention is, it usually do not allow you to easily distinguish:
instance (usually private) variable
local variables (used as parameters or local to a function)
I use a naming convention based on a the usage of indefinite article (a, an, some) for local variables, and no article for instance variables, as explained in this question about variable prefix.
So a prefix (either my way, or any other prefix listed in that question) can be one way of refining variable naming convention.
I know your question is not limited on variables, I just wanted to point out that part of naming convention.

There is, the "Java Programmers Phrasebook" by Einar Hoest.
He analysed the grammatical structure of method names together with the structure of the method bodies, and collected information such as:
add-[noun]-* These methods often have parameters and create objects.
They very rarely return field values.
The phrase appears in practically all
applications.
etcetera ... collected from hundreds of open-source projects.
For more background information, please refer to his SLE08 paper.

Related

Why do people put "my" in their variable names?

I have seen an incredibly large number of variables in our codebase at work that are along the lines of myCounter or myClassVariable. Why?
I don't just see this at work, either. I see this in tutorials online, github, blogs, etc. I would understand if it's just a placeholder for an example, but otherwise I can't imagine it being a standard of any kind.
Is this a holdover from some old standard or is it just a bad practice that snuck in before code reviews were common place? Was it an ancestor to people's usage of the underscore to indicate a _ClassName?
Starting a variable name with my indicates that it is defined by the programmer rather than a predefined variable. In a statically typed language (with explicit types) it also prevents a name clash with the type name, for instance
Collection myCollection;
A my variable name should only be used in generic examples where the variable has no other interpretation.
It may also be worth mentioning that in the programming language Perl (since version 5 from 1994) the keyword my is used to declare a local variable, for instance
my $message = "hello there";
https://perldoc.perl.org/functions/my

Good practice in naming derived types in fortran

I'd like to optimize the readability of my codes in Fortran by using OOP.
I thus use derived types. what is the best practice to name the types and derived types?
For example, is it better to:
type a
real :: var
end type
type(a) :: mya
or always begin type names by type_ like in type_a? I like this one but maybe better ideas can be foud.
Also, is it better (then why) to use short names that are less readable or longer names that end up quite difficult to read if the type has too many "levels". For example, in a%b%c%d%e, if a, b, c, d and e are 8 or more letters long as in country%hospital%service%patient%name, then once again readability seems to be a concern.
Advices from experts are really welcome.
This not anything special to Fortran. You can use coding recommendation for other languages.
Usually, type names are not designated by any prefix or suffix. In many languages class names start with a capital letter. You can use this in Fortran also, even if it is not case sensitive. Just be sure not to reuse the name with a small letter as a variable name.
One example of a good coding guideline is this and you can adapt it for Fortran very easily. Also, have a look on some Fortran examples in books like MRC or RXX. OS can be also useful.
I would recommend not to use too short component names, if the letter is not the same as used in the written equation. In that case it can be used.
Use the associate construct or pointers to make aliases to nested names like country%hospital%service%patient%name.
In my experience, naming issues come up in OO Fortran more than other languages (e.g. C++) because of Fortran's named modules, lack of namespaces, and case-insensitivity, among other things. Case-insensitivity hurts because if you name a type Foo, you cannot have a variable named foo or you will get a compiler error (tested with gfortran 4.9).
The rules I have settled on are:
Each Fortran module provides a single, primary class named Namespace_Foo.
The class Namespace_Foo can be located in your source tree as Namespace/Foo_M.f90.
Class variables are nouns with descriptive, lower case names like bar or bar_baz.
Class methods are verbs with descriptive (but short if possible) names and use a rename search => Namespace_Foo_search.
Instances of class Namespace_Foo can be named foo (without namespace) when there is no easy alternative.
These rules make it particularly easy to mirror a C/C++ class Namespace::Foo in Fortran or bind (using BIND(C)) a C++ class to Fortran. They also avoid all of the common name collisions I've run into.
Here's a working example (tested with gfortran 4.9).
module Namespace_Foo_M
implicit none
type :: Namespace_Foo
integer :: bar
real :: bar_baz
contains
procedure, pass(this) :: search => Namespace_Foo_search
end type
contains
function Namespace_Foo_search(this, offset) result(index)
class(Namespace_Foo) :: this
integer,intent(in) :: offset !input
integer :: index !return value
index = this%bar + int(this%bar_baz) + offset
end function
end module
program main
use Namespace_Foo_M !src/Namespace/Foo_M.f90
type(Namespace_Foo) :: foo
foo % bar = 1
foo % bar_baz = 7.3
print *, foo % search(3) !should print 11
end program
Note that for the purpose of running the example, you can copy/paste everything above into a single file.
Final Thoughts
I have found the lack of namespaces extremely frustrating in Fortran and the only way to hack it is to just include it in the names themselves. We have some nested "namespaces", e.g. in C++ Utils::IO::PrettyPrinter and in Fortran Utils_IO_PrettyPrinter. One reason I use CamelCase for classes, e.g. PrettyPrinter instead of Pretty_Printer, is to disambiguate what is a namespace. It does not really matter to me if namespaces are upper or lower case, but the same case should be used in the name and file path, e.g. class utils_io_PrettyPrinter should live at utils/io/PrettyPrinter_M.f90. In large/unfamiliar projects, you will spend a lot of time searching the source tree for where specific modules live and developing a convention between module name and file path can be a major time saver.

Ocaml naming convention

I am wondering if there exists already some naming conventions for Ocaml, especially for names of constructors, names of variables, names of functions, and names for labels of record.
For instance, if I want to define a type condition, do you suggest to annote its constructors explicitly (for example Condition_None) so as to know directly it is a constructor of condition?
Also how would you name a variable of this type? c or a_condition? I always hesitate to use a, an or the.
To declare a function, is it necessary to give it a name which allows to infer the types of arguments from its name, for example remove_condition_from_list: condition -> condition list -> condition list?
In addition, I use record a lot in my programs. How do you name a record so that it looks different from a normal variable?
There are really thousands of ways to name something, I would like to find a conventional one with a good taste, stick to it, so that I do not need to think before naming. This is an open discussion, any suggestion will be welcome. Thank you!
You may be interested in the Caml programming guidelines. They cover variable naming, but do not answer your precise questions.
Regarding constructor namespacing : in theory, you should be able to use modules as namespaces rather than adding prefixes to your constructor names. You could have, say, a Constructor module and use Constructor.None to avoid confusion with the standard None constructor of the option type. You could then use open or the local open syntax of ocaml 3.12, or use module aliasing module C = Constructor then C.None when useful, to avoid long names.
In practice, people still tend to use a short prefix, such as the first letter of the type name capitalized, CNone, to avoid any confusion when you manipulate two modules with the same constructor names; this often happen, for example, when you are writing a compiler and have several passes manipulating different AST types with similar types: after-parsing Let form, after-typing Let form, etc.
Regarding your second question, I would favor concision. Inference mean the type information can most of the time stay implicit, you don't need to enforce explicit annotation in your naming conventions. It will often be obvious from the context -- or unimportant -- what types are manipulated, eg. remove cond (l1 # l2). It's even less useful if your remove value is defined inside a Condition submodule.
Edit: record labels have the same scoping behavior than sum type constructors. If you have defined a {x: int; y : int} record in a Coord submodule, you access fields with foo.Coord.x outside the module, or with an alias foo.C.x, or Coord.(foo.x) using the "local open" feature of 3.12. That's basically the same thing as sum constructors.
Before 3.12, you had to write that module on each field of a record, eg. {Coord.x = 2; Coord.y = 3}. Since 3.12 you can just qualify the first field: {Coord.x = 2; y = 3}. This also works in pattern position.
If you want naming convention suggestions, look at the standard library. Beyond that you'll find many people with their own naming conventions, and it's up to you to decide who to trust (just be consistent, i.e. pick one, not many). The standard library is the only thing that's shared by all Ocaml programmers.
Often you would define a single type, or a single bunch of closely related types, in a module. So rather than having a type called condition, you'd have a module called Condition with a type t. (You should give your module some other name though, because there is already a module called Condition in the standard library!). A function to remove a condition from a list would be Condition.remove_from_list or ConditionList.remove. See for example the modules List, Array, Hashtbl,Map.Make`, etc. in the standard library.
For an example of a module that defines many types, look at Unix. This is a bit of a special case because the names are mostly taken from the preexisting C API. Many constructors have a short prefix, e.g. O_ for open_flag, SEEK_ for seek_command, etc.; this is a reasonable convention.
There's no reason to encode the type of a variable in its name. The compiler won't use the name to deduce the type. If the type of a variable isn't clear to a casual reader from the context, put a type annotation when you define it; that way the information provided to the reader is validated by the compiler.

Separate Namespaces for Functions and Variables in Common Lisp versus Scheme

Scheme uses a single namespace for all variables, regardless of whether they are bound to functions or other types of values. Common Lisp separates the two, such that the identifier "hello" may refer to a function in one context, and a string in another.
(Note 1: This question needs an example of the above; feel free to edit it and add one, or e-mail the original author with it and I will do so.)
However, in some contexts, such as passing functions as parameters to other functions, the programmer must explicitly distinguish that he's specifying a function variable, rather than a non-function variable, by using #', as in:
(sort (list '(9 A) '(3 B) '(4 C)) #'< :key #'first)
I have always considered this to be a bit of a wart, but I've recently run across an argument that this is actually a feature:
...the
important distinction actually lies in the syntax of forms, not in the
type of objects. Without knowing anything about the runtime values
involved, it is quite clear that the first element of a function form
must be a function. CL takes this fact and makes it a part of the
language, along with macro and special forms which also can (and must)
be determined statically. So my question is: why would you want the
names of functions and the names of variables to be in the same
namespace, when the primary use of function names is to appear where a
variable name would rarely want to appear?
Consider the case of class names: why should a class named FOO prevent
the use of variables named FOO? The only time I would be referring the
class by the name FOO is in contexts which expect a class name. If, on
the rare occasion I need to get the class object which is bound to the
class name FOO, there is FIND-CLASS.
This argument does make some sense to me from experience; there is a similar case in Haskell with field names, which are also functions used to access the fields. This is a bit awkward:
data Point = Point { x, y :: Double {- lots of other fields as well --} }
isOrigin p = (x p == 0) && (y p == 0)
This is solved by a bit of extra syntax, made especially nice by the NamedFieldPuns extension:
isOrigin2 Point{x,y} = (x == 0) && (y == 0)
So, to the question, beyond consistency, what are the advantages and disadvantages, both for Common Lisp vs. Scheme and in general, of a single namespace for all values versus separate ones for functions and non-function values?
The two different approaches have names: Lisp-1 and Lisp-2. A Lisp-1 has a single namespace for both variables and functions (as in Scheme) while a Lisp-2 has separate namespaces for variables and functions (as in Common Lisp). I mention this because you may not be aware of the terminology since you didn't refer to it in your question.
Wikipedia refers to this debate:
Whether a separate namespace for functions is an advantage is a source of contention in the Lisp community. It is usually referred to as the Lisp-1 vs. Lisp-2 debate. Lisp-1 refers to Scheme's model and Lisp-2 refers to Common Lisp's model. These names were coined in a 1988 paper by Richard P. Gabriel and Kent Pitman, which extensively compares the two approaches.
Gabriel and Pitman's paper titled Technical Issues of Separation in Function Cells and Value Cells addresses this very issue.
Actually, as outlined in the paper by Richard Gabriel and Kent Pitman, the debate is about Lisp-5 against Lisp-6, since there are several other namespaces already there, in the paper are mentioned type names, tag names, block names, and declaration names. edit: this seems to be incorrect, as Rainer points out in the comment: Scheme actually seems to be a Lisp-1. The following is largely unaffected by this error, though.
Whether a symbol denotes something to be executed or something to be referred to is always clear from the context. Throwing functions and variables into the same namespace is primarily a restriction: the programmer cannot use the same name for a thing and an action. What a Lisp-5 gets out of this is just that some syntactic overhead for referencing something from a different namespace than what the current context implies is avoided. edit: this is not the whole picture, just the surface.
I know that Lisp-5 proponents like the fact that functions are data, and that this is expressed in the language core. I like the fact that I can call a list "list" and a car "car" without confusing my compiler, and functions are a fundamentally special kind of data anyway. edit: this is my main point: separate namespaces are not a wart at all.
I also liked what Pascal Constanza had to say about this.
I've met a similar distinction in Python (unified namespace) vs Ruby (distinct namespaces for methods vs non-methods). In that context, I prefer Python's approach -- for example, with that approach, if I want to make a list of things, some of which are functions while others aren't, I don't have to do anything different with their names, depending on their "function-ness", for example. Similar considerations apply to all cases in which function objects are to be bandied around rather than called (arguments to, and return values from, higher-order functions, etc, etc).
Non-functions can be called, too (if their classes define __call__, in the case of Python -- a special case of "operator overloading") so the "contextual distinction" isn't necessarily clear, either.
However, my "lisp-oid" experience is/was mostly with Scheme rather than Common Lisp, so I may be subconsciously biased by the familiarity with the uniform namespace that in the end comes from that experience.
The name of a function in Scheme is just a variable with the function as its value. Whether I do (define x (y) (z y)) or (let ((x (lambda (y) (z y)))), I'm defining a function that I can call. So the idea that "a variable name would rarely want to appear there" is kind of specious as far as Scheme is concerned.
Scheme is a characteristically functional language, so treating functions as data is one of its tenets. Having functions be a type of their own that's stored like all other data is a way of carrying on the idea.
The biggest downside I see, at least for Common Lisp, is understandability. We can all agree that it uses different namespaces for variables and functions, but how many does it have? In PAIP, Norvig showed that it has "at least seven" namespaces.
When one of the language's classic books, written by a highly respected programmer, can't even say for certain in a published book, I think there's a problem. I don't have a problem with multiple namespaces, but I wish the language was, at the least, simple enough that somebody could understand this aspect of it entirely.
I'm comfortable using the same symbol for a variable and for a function, but in the more obscure areas I resort to using different names out of fear (colliding namespaces can be really hard to debug!), and that really should never be the case.
There's good things to both approaches. However, I find that when it matters, I prefer having both a function LIST and a a variable LIST than having to spell one of them incorrectly.

How is the `*var-name*` naming-convention used in clojure?

As a non-lisper coming to clojure how should I best understand the naming convention where vars get a name like *var-name*?
This appears to be a lisp convention indicating a global variable. But in clojure such vars appear in namespaces as far as I can tell.
I would really appreciate a brief explanation of what I should expect when an author has used such vars in their code, ideally with a example of how and why such a var would be used and changed in a clojure library.
It's a convention used in other Lisps, such as Common Lisp, to distinguish between special variables, as distinct from lexical variables. A special or dynamic variable has its binding stored in a dynamic environment, meaning that its current value as visible to any point in the code depends upon how it may have been bound higher up the call stack, as opposed to being dependent only on the most local lexical binding form (such as let or defn).
Note that in his book Let Over Lambda, Doug Hoyte argues against the "earmuffs" asterix convention for naming special variables. He uses an unusual macro style that makes reference to free variables, and he prefers not to commit to or distinguish whether those symbols will eventually refer to lexical or dynamic variables.
Though targeted specifically at Common Lisp, you might enjoy Ron Garret's essay The Idiot's Guide to Special Variables. Much of it can still apply to Clojure.
Functional programming is all about safe predictable functions. Infact some of us are afraid of that spooky "action at a distance" thing. When people call a function they get a warm fuzzy satisfaction that the function will always give them the same result if they call the function or read the value again. the *un-warm-and-fuzzy* bristly things exist to warn programmers that this variable is less cuddly than some of the others.
Some references I found in the Clojure newsgroups:
Re: making code readable
John D. Hume
Tue, 30 Dec 2008 08:30:57 -0800
On Mon, Dec 29, 2008 at 4:10 PM, Chouser wrote:
I believe the idiom for global values like this is to place asterisks
around the name.
I thought the asterisk convention was for variables intended for
dynamic binding. It took me a minute to figure out where I got that
idea. "Programming Clojure" suggests it (without quite saying it) in
chapter 6, section 3.
"Vars intended for dynamic binding are sometimes called special vari-
ables. It is good style to name them
with leading and trailing asterisks."
Obviously the book's a work in progress, but that does sound
reasonable. A special convention for variables whose values change (or
that my code's welcome to rebind) seems more useful to me than one for
"globals" (though I'm not sure I'd consider something like grid-size
for a given application a global). Based on ants.clj it appears Rich
doesn't feel there needs to be a special naming convention for that
sort of value.
and...
I believe the idiom for global values like this is to place asterisks
around the name. Underscores (and CamelCase) should only be used when
required for Java interop:
(def *grid-size* 10)
(def *height* 600)
(def *margin* 50)
(def *x-index* 0)
(def *y-index* 1)