camel case method names - naming-conventions

I understand the reason to camel case variable names, but I've always wondered why you would camel case a method name? why is it toString() and not ToString()? What purpose does it serve?

A lot of conventions say you capitalize the first letter of types (classes, structs, enums, etc.), and use lowercase otherwise (functions, members, etc.).
If you follow that convention, you can then tell just by looking that MyStruct.MyType refers to a nested type, and MyStruct.myData refers to some form of data, and MyStruct.myFunc() refers to a function call.

We use lower-case on the first letter to save a little ink in our printouts.

It's just a convention. Like all conventions they only serve to, in the minds of their creators, make code easier to read and maintain.

Because that's what the original designers of Java liked.

Because to be consistent you'd have to capitalize the first letter of every method name, and then you have to hit the Shift key that many more times in a day.

Since camel cases capitalizes the first letter of each word to substitute spaces, we are left with the challenge of how to differentiate a capitalized title, like we would in English for a proper noun. As a solution to this, the first word in a camel case identifier is capitalized to indicate the title or identifier is capitalized.
In the case of programming, it seems appropriate to most, to capitalize the name of a class, but not the name of its methods. In practically it provides a nice distinction between the two.
Over the years programming has evolved to have a lot of conventions, while many are very different, there is much that people tend to agree on. However, you will find that the answers to "why questions", such as the one you posted, are not always rooted in something entirely concrete.

I don't think there is any reason, these are just conventions and everyone might have his own.

If you want a function
write();
that takes less effort (one less SHIFT keypress) than
Write();
However, if you're writing to a file, you need to distinguish the words. Hence
writeToFile();
is marginally more efficient (and still consistent with the first example)

Usually you tend to follow the one that your framework uses. So Java developers tend to use lowercase to start, and .NET developers tend to use uppercase to start.

If you haven't already read the wikipedia page, it contains everything you could ever possibly want to know on camel case, including its history.
CamelCase (also spelled "camel case")
or medial capitals is the practice of
writing compound words or phrases in
which the elements are joined without
spaces, with each element's initial
letter capitalized within the
compound.
And
One theory for the origin of the camel
case convention holds that C
programmers and hackers simply found
it more convenient than the standard
underscore-based style.
C programmers lazy? I doubt that very much.

Pascal case
The first letter in the identifier and the first letter of each subsequent concatenated word are capitalized. You can use Pascal case for identifiers of three of more characters.
For example: BackColor
Uppercase
All letters in the identifier are capitalized. Use this convention only for identifiers that consist of two or less letters, or abbreviations, such as BSU and UCS. In the example below, IO and UI are the uppercase identifiers, whereas System follows the Pascal capitalization style because the length is greater than two.
For example: System.IO; System.Web.UI
Camel case
The first letter of an identifier is lowercase and the first letter of each subsequent concatenated word is capitalized.
For example: backColor
The following Link summarizes the capitalization rules and provides examples for the different types of identifiers and items within your Web application/project.Hopefully this link will helpful for you
https://dotnetprod.bsu.edu/AdminConsole/Documentation/ASPDotNet/CodingStyle/NamingConventions/Capitalization.aspx

A friend told me that a study showed that people could read code easier if Types were camel case, with an upper case first letter, and variables were done_like_this. It certainly does make the difference between types and variables jump out.
I've never known which was clearer for function names. I've generally considered capitalizing the first letter, but after reading this I think it might be more readable not to to distinguish between type names and method names (yes, in some languages a method signature is a type, but you know what I mean!)

Related

"var_name" VS "var-name" VS "varName" - Which is supported by the widest variety of languages

I have seen multi-word variables displayed in 3 ways:
With an underscore - var_name
With a hyphen - var-name
With capitalization - varName
Different languages support different ways. For example, you can use - (hyphens) in CSS classnames / IDs, but not in Javascript variable names
My ultimate question is:
Which is most widely supported across different programming languages, and are there any advantages or limitations of each?
I think it is a tie between the underscore '_' and camel case - or capitalization as you call it. Hyphen gets interpreted as an operator in many languages, so it should be avoided as a common naming tool.
All languages I have used (fortran, pascal, C/C++, Java, PHP, JS, Python) all differentiate and accept capitalization and underscores in variable names. So, it becomes a matter of preference. I was taught camelCase through school so that is what I have stuck with. It creates a bit shorter variable name than inserting underscores and is easier to type. Having said that, I think names with underscores are a bit more readable, and that is reason enough to use them over others. If I could get my brain to change it's habits, I would use underscores.

What does the word "semantic" mean in Computer Science context?

I keep coming across the use of this word and I never understand its use or the meaning being conveyed.
Phrases like...
"add semantics for those who read"
"HTML5 semantics"
"semantic web"
"semantically correctly way to..."
... confuse me and I'm not just referring to the web. Is the word just another way to say "grammar" or "syntax"?
Thanks!
Semantics are the meaning of various elements in the program (or whatever).
For example, let's look at this code:
int width, numberOfChildren;
Both of these variables are integers. From the compiler's point of view, they are exactly the same. However, judging by the names, one is the width of something, while the other is a count of some other things.
numberOfChildren = width;
Syntactically, this is 100% okay, since you can assign integers to each other. However, semantically, this is totally wrong, since the width and the number of children (probably) don't have any relationship. In this case, we'd say that this is semantically incorrect, even if the compiler permits it.
Syntax is structure. Semantics is meaning. Each different context will give a different shade of meaning to the term.
HTML 5, for example, has new tags that are meant to provide meaning to the data that is wrapped in the tags. The <aside> tag conveys that the data contained within is tangentially-related to the information around itself. See, it is meaning, not markup.
Take a look at this list of HTML 5's new semantic tags. Contrast them against the older and more familiar HTML tags like <b>, <em>, <pre>, <h1>. Each one of those will affect the appearance of HTML content as rendered in a browser, but they can't tell us why. They contain no information of meaning.
The word ‘semantic ‘as an adjective simply means ‘meaningful’ which is very related to the word 'high level' in computer science.
For instances:
Semantic data model:
a data model that is semantic, that is meaningful and understood by anyone regardless of his background or expertise.
C++ is less semantic than Java, because Java uses meaningful words for its classes, methods and fields.
HTML5 semantics: refer to the tags that describe themselves such , , and so on.
It means "meaning", what you've got left when you've already accounted for syntax and grammar. For example, in C++ i++; is defined by the grammar as a valid statement, but says nothing about what it does. "Increment i by one" is semantics.
HTML5 semantics is what a well-formed HTML5 description is supposed to put on the page. "Semantic web" is, generally, a web where links and searches are on meaning, not words. The semantically correct way to do something is how to do it so it means the right thing.
It is not just Computer Science terminology, and if you ask,
What is the meaning behind this Computer Science lingo?
then I'm afraid we'll get in a recursive loop just like this.
In the HTML world, "semantic" is used to talk about the meaning of tags, rather than just considering how the output looks. For example, it's common to italicize foreign words, and it's also common to italicize emphasized words. You could simply wrap all foreign or emphasized words in <i>..</i> tags, but that only describes how they look, it doesn't describe why they look that way.
A better tag to use for emphasized word is <em>..</em>, because it conveys the semantics of emphasis. The browser (or your stylesheet) can then render them in italics, and other consumers of the page will know the word is emphasized. For example, a screen-reader could properly read it as an emphasized word.
From my view, it's almost like looking at syntax in a grammatical way. I can't speak to semantics in a broad term, but When people talk about semantics on the web, they are normally referring to the idea that if you stripped away all of the css and javascript etc; what was left (the bare bones html) would make sense to be read.
It also takes into account using the correct tags for correct markup. This stems from the old table-based layouts (tables should only be used for tabular data), and using lists to present list-like content.
You wouldn't use an h1 for something that was less important than an h2. That wouldn't make sense.
The below is syntactically different but semantically the same:
C, C++, C#, Java, JavaScript, Python, Ruby, etc.
x += y
Perl, PHP
$x += $y

Better identifier names? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
How can I train myself to give better variable and function names (any user-defined name in a program).
Practice.
Always be thinking about it whenever you write or read code, including other people's code. Try to figure out what you would do differently in their code and talk to them about it, when possible (but don't harp on it, that would be a nuisance). Ask them questions about why they picked a certain name.
See how well your code reads without comments. If you ever need to comment on the basic purpose of something you named, consider whether it could have a better name.
The biggest thing is active mental participation: practice.
Thinking of names seems to be something that some people are extraordinarily bad at, and I'm not sure what the cure is. Back when I was an instructor working in commercial training, I'd often see situations like this:
Me: OK, now you need to create an integer variable to contain the value returned by getchar().
[Trainees start typing, and I wander round the training room. Most are doing fine, but one is sitting like a deer frozen the headlights]
Me: What's the problem?
Him: I can't think of a name for the variable!
So, I'd give them a name for it, but I have a feeling that people with this problem are not going to go far in programming. Or perhaps the problem is they go too far...
This is a subjective question.
I strive to make my code align with the libraries (or, at the least the standard ones) so that the code has an uniformity. I'd suggest: See how the standard library functions are named. Look for patterns. Learn what different naming conventions exist. See which one makes the most sense. E.g: most Java code uses really big identifier names, Camel casing etc. C uses terse/short names. There is then the Hungarian notation -- which was worth the trouble when editors weren't smart enough to provide you with type information. Nowadays, you probably don't need it.
Joel Spolsky wrote a helpful article on Hungarian notation a few years back. His key insight is this:
Let’s try to come up with a coding
convention that will ensure that if
you ever make this mistake, the code
will just look wrong. If wrong code,
at least, looks wrong, then it has a
fighting chance of getting caught by
someone working on that code or
reviewing that code.
He goes on to show how naming variables in a rigourous fashion can improve our code. The point being, that avoiding bugs has a quicker and more obvious ROI than making our code more "maintainable".
Read some good code and imitate it. That's the universal way of learning anything; just replace "read" and "code" with appropriate words.
A good way to find expressive names is starting with a textual description what your piece of software actually does.
Good candidates for function (method) names are verbs, for classes nouns.
If you do design first, one method is textual analysis.
(Even if you are only a team of 1) agree a coding standard with your colleagues so you all use the same conventions for naming. (e.g. It is common to use properties for values that are returned quickly, but a GetXXX or CalculateXXX method for values that will take time to calculate. This convention gives the caller a much better idea about whether they need to cache the results etc). Try to use the same names for equivalent concepts (e.g. Don't mix Array.Count and List.Length as Microsoft did in .net!)
Try to read your code as if somebody else wrote it (i.e. forget everything you know and just read it). Does it make sense? Does it explain everything they need to know to understand it? (Probably not, because we all forget to describe the stuff we "know" or which is "obvious". Go back and clarify the naming and documentation so that someone else can pick up your code file and easily understand it)
Keep names short but descriptive. There is no point writing a whole sentence, but with auto-completion in most IDEs, there is also little point in abbreviating anything unless it's a very standard abbreviation.
Don't waste characters telling somebody that this string is a string (the common interpretation of hungarian notation). Use names that describe what something does, and how it is used. e.g. I use prefixes to indicate the usage (m=member, i=iterator/index, p=pointer, v=volatile, s=static, etc). This is important information when accessing the variable so it's a useful addition to the name. It also allows you to copy a line of code into an email and the receiver can understand exactly what all the variables are meant to do - the difference in usage between a static volatile and a parameter is usually very important.
Describe the contents of a variable or the purpose of a method in its name, avoiding technical terms unless you know that all the readers of your code will know what those terms mean. Use the simplest description you can think of - complex words and technical terms sound intelligent and impressive, but are much more open to misinterpretation (e.g. off the top of my head: Collation or SortOrder, Serialise or Save - though these are well known words in programming so are not very good cases).
Avoid vague and near-meaningless terms like "value", "type". This is especially true of base class properties, because you end up with a "Type" in a derived class and you have no idea what kind if type it is. Use "JoystickType" or "VehicleType" and the meaning is immediately much clearer.
If you use a value with units, tell people what they are in the name (angleDegrees rather than angle). This simple trick will stop your spacecraft smashing into Mars.
For C#, C++, C in Visual Studio, try using AtomineerUtils to add documentation comments to methods, classes etc. This tool derives automatic documentation from your names, so the better your names are, the better the documentation is and the less effort you need to put in the finish the documentation off.
Read "Code Complete" book, more specifically Chapter 11 about Naming. This is the checklist (from here, free registration required):
General Naming Considerations
Does the name fully and accurately describe what the variable represents?
Does the name refer to the real-world problem rather than to the programming-language solution?
Is the name long enough that you don't have to puzzle it out?
Are computed-value qualifiers, if any, at the end of the name?
Does the name use Count or Index instead of Num?
Naming Specific Kinds Of Data
Are loop index names meaningful (something other than i, j, or k if the loop is more than one or two lines long or is nested)?
Have all "temporary" variables been renamed to something more meaningful?
Are boolean variables named so that their meanings when they're True are clear?
Do enumerated-type names include a prefix or suffix that indicates the category-for example, Color_ for Color_Red, Color_Green, Color_Blue, and so on?
Are named constants named for the abstract entities they represent rather than the numbers they refer to?
Naming Conventions
Does the convention distinguish among local, class, and global data?
Does the convention distinguish among type names, named constants, enumerated types, and variables?
Does the convention identify input-only parameters to routines in languages that don't enforce them?
Is the convention as compatible as possible with standard conventions for the language?
Are names formatted for readability?
Short Names
Does the code use long names (unless it's necessary to use short ones)?
Does the code avoid abbreviations that save only one character?
Are all words abbreviated consistently?
Are the names pronounceable?
Are names that could be mispronounced avoided?
Are short names documented in translation tables?
Common Naming Problems: Have You Avoided...
...names that are misleading?
...names with similar meanings?
...names that are different by only one or two characters?
...names that sound similar?
...names that use numerals?
...names intentionally misspelled to make them shorter?
...names that are commonly misspelled in English?
...names that conflict with standard library-routine names or with predefined variable names?
...totally arbitrary names?
...hard-to-read characters?

Case sensitive Vs insensitive syntax

Can anyone make really a good case ( :-) ) for being case sensitive?
C#: case sensitive
VB.NET: not case sensitive
C++: case sensitive
...
Worse part: XML which is used inside a language like VB.NET is case sensitive.
I was making the case that it is ridiculous and can only cause harm after we found a bug in our system due to the fact that XML had both Value and value nodes...
I am asked over and over in comments
"Perhaps you can come up with a single
argument for why case insensitive is
the right choice in such a world?"
Here is an example:
I see it analogous to the issue of: URL's should be case sensitive?
www.cnn.com <> Www.cNN.com ?
Of course they should be the same, ID theft heaven! because humans don't put that much attention to 2 strings that are the same but might have otherwise different casing. Programmers are humans. So getAge() and getage() are the same in most human's minds.
Please notice: I do not think we want the code to actually have a function defined as getAget() and then have code calling it getage(), VS (vb.net) will automatically correct getaget to getAge. So the code is clear and the programmer is aware of the correct capitalization. My point is: good IDE makes the issue non relevant, but it works better in a non case-sesnsetive language like vb.net then lets say c#.
Reference: here
Case rules depend on culture. Do you want a programming language where a variable i is sometimes considered to be the same as one called I and sometimes they're different variables? (That's not a made-up example, btw. In Turkish, I is not an upper-case i.
Honestly, it's pretty simple. Do you want the compiler to correct you when you make a typo, or do you want it to guess at what you meant? The latter leads to bugs, as you found out. VB assumes "oh, you probably meant the same thing, that's ok, we won't stop you", and XML took you literally.
Your bug didn't occur because case sensitivity is bad, it occurred because being sloppy is bad. Arbitrarily changing case may, at best, cause no problems, and at worst it will cause errors. Assume the worst, and be consistent with your case. Which, incidentally, is what case sensitive languages force you to do. Whether or not your tools are case sensitive, the programmer should be case sensitive. Being case sensitive saves you a lot of trouble as long as the world features insensitive as well as sensitive tools. If we could remake the world so that everything was case insensitive, a lot of the reasons in favor of sensitivity would go away. but we can't.
A little side note of course:
In many languages, it is common to give variables and types the same names, but with different capitalization:
Foo foo; // declare a variable foo of type Foo
Of course you could argue that "you shouldn't do that", but it's convenient, and it immediately tells the reader what type the variable has. It allows us to create a Log class, and a log object. And since the purpose of this object is to log, the name is kinda obvious.
And a final point to consider:
Case matters in real languages. A word that begins with upper-case is different from the same one but with leading lower-case. The word "worD" is not correct english. Information is encoded in the case, which makes text easier to read. It tells us when we encounter a name, for example, or when a sentence begins, which is handy. Allowing people to ignore case rules makes text harder to read. And since code should generally be written as readable as possible, why shouldn't we do the same in programming? Allow the case to encode important information. In many languages, Foo is a type, and foo is a variable. That's important information. I want to know this when I program. If I see a function called "Getage", I wonder if that's some English word I've never heard before. But when I see "GetAge", I immediately know that it should be read as the word "Get" followed by the word "Age".
By the way, here's a nice example of the fun surprises you can run into in case sensitive languages.
Slop is never a good idea in a programming language. You want things to be as specific as possible. You never want your language to guess at anything and it should allow as few ways to solve a given problem as possible.
As for a specific answer, how about readability? Doesn't stoRetroData visually differ quite a bit from storeTRodAtA? Not that anyone would do such a thing, but what's the point in allowing it?
I can't come up with any reason to allow ignoring case.
At least that's my opinion--but your mileage may vary.
Edit: I probably should have started this out with a disclaimer:
I learned to program in basic and had this same thought fleetingly about 18 years ago. Trust me, it's one of those things you'll look back on in 20 years and go "Oh, yeah, I was pretty wrong about that" (as I am right now)
History It is the way it has been done. The XML is VB.NET is case sensitive because the XML standard requires it
Internationalization Are we going to support case in all languages (French, Japanese, Hebrew, Klingon, etc.)?
Several case sensitive languages nowadays are that way because the languages they were based on were case sensitive and the transition would be easier. Personally I prefer case sensitive, but Jeff Atwood wrote a pretty good article on why case sensitivity may no longer be necessary.
Couple of reasons.
Finding things, with case-insensitive I'd have to have 'case-insensitive' flag about everywhere. With UTF-8 that should be also aware of Klingon smallcase..
More importaintly, CamelCasing, CAMelcaSing. It's not pretty, but it's used a lot and is fairly sane. Is nigh impossible with case-insensitivity.
Language parity, for example xsd.exe (shipped with VS200x) can generate you classes for xsd that you supply. What would be your "Value" named when you also have "value"? So this takes out yet another possible impedance.
Case is good in programming languages, but rather than use it in symbol names we should use it as it was originally intended- to delimit the beginning of a sentence or command or a proper name. For example:
Var test = 0;
Console.writeline(test);
Test = test + 1;
Console.writeline(test);
So beautiful,... :P
I find case-insensitivity to be just silly. You should follow the capitalization of the original declaration. I can't see any good reasons for not doing so beside being too lazy to type TheRealName instead of therealname.
In fact, I would never even consider using a case-insensitive language.

When to join name and when not to?

Most languages give guidelines to separate different words of a name by underscores (python, C etc.) or by camel-casing (Java). However the problem is when to consider the names as separate. The options are:
1) Do it at every instance when separate words from the English dictionary occur e.g. create_gui(), recv_msg(), createGui(), recvMsg() etc.
2) Use some intuition to decide when to do this and when not to do this e.g. recvmsg() is OK, but its better to have create_gui() .
What is this intuition?
The question looks trivial. But it presents a problem which is common and takes at least 5 seconds for each instance whenever it appears.
I always do your option 1, and as far as I can tell, all modern frameworks do.
One thing that comes to mind that just sticks names together is the standard C library. But its function names are often pretty cryptic anyway.
I'm probably biased as an Objective-C programmer, where things tend to be quite spelled out, but I'd never have a method like recvMsg. It would be receiveMessage (and the first parameter should be of type Message; if it's a string, then it should be receiveString or possibly receiveMessageString depending on context). When you spell things out this way, I think the question tends to go away. You would never say receivemessage.
The only time I abbreviate is when the abbreviation is more clear than the full version. createGUI is good because "GUI" (gooey) is the common way we say it in English. createGraphicalUserInterface is actually more confusing, so should be avoided.
So to the original question, I believe #1 is best, but coupled with an opposition to unclear abbreviations.
One of the most foolish naming choices ever made in Unix was creat(), making a nonsense word to save one keystroke. Code is written once and read many times, so it should be biased towards ease of reading rather than writing.
For me, and this is just me, I prefer to follow whatever is conventional for the language, thus camelCase for Java and C++, underscore for C and SQL.
But whatever you do, be consistent within any source file or project. The reader of your code will thank you; seeing an identifier that is inconsistent with most others makes the reader pause and ask "is something different going on with this identifier? Is there something here I should be noticing?"
Or in other words, follow the Principal of Least Surprise.
Edit: This got downmodded why??
Just follow coding style, such moments usually well described.
For example:
ClassNamesInCamelNotaionWithFirstLetterCapitalized
classMethod()
classMember
CONSTANTS_IN_UPPERCASE_WITH_UNDERSCORE
local_variables_in_lowercase_with_underscores