What is readable code? What are the best practices to follow while naming variables? - naming-conventions

Do you think x, y, z are good variable names? How will you explain a new programmer to write readable code?

Readable code means some combination of comments and variable and function naming that allows me to read the code once and understand it. If I have to read it more than once, or spend my time working through complicated loops or functions, there's room for improvement.
Good summary descriptions at the top of files and classes are useful to give the reader context and background information.
Clear names are important. Verbose names make it much easier to write readable code with far fewer comments.
Writing readable code is a skill that takes some time to learn. I personally like overly verbose names because they create self documenting code.

As already stated x, y, and z are good variables for 3D coordinates but probably bad for anything else...
If someone does not believe that names are important, just use a code obfuscator on some code then ask them to debug it :-).
(BTW that's the only situation where a code obfuscator can be useful IMHO)

There seems to be slightly different conventions per progamming language; however, the consensus these days is to...
use pascal case
make the name meaningful
end with a noun
Here is a decent recap of what Microsoft publishes as standard naming conventions for .NET
The inventor of python has published a style guide which includes naming conventions.
There was a time when Microsoft VC++ developers (myself included) actually rallied around what was known as Hungarian Notation

Certainly there are multiple schools of thought on this, but I would only use these for counters, and advise far more descriptive names for any other variables.

x, y and z can be perfectly good variable names. For example you might be writing code that refers to them in reference to a 3D cartesian coordinate system. These names are often used for the three axes in such a system and as such they would be well suited.

I would give them some maintenance work on some code with variables called x, y, z and let them realise for themselves that readability is vital...
95% of code viewing is not by the author, but by the customer that everyone forgets about - the next programmer. You owe it to her to make her life easy.

Good variable names describe exactly what they are without being overly complex. I always use descriptive names, even in loops (for instance, index instead of i). It helps keep track of what's going on, especially when I'm working on rewriting a particularly complex piece of code.

Well give them a chunk of bad code and ask them to debug it.
Take the following code (simple example)
<?php $a = fopen('/path/to/file.ext', 'w');$b = "NEW LINE\n";fwrite($a, $b);fclose($a);?>
The bug is: File only ever contains 1 line when it should be a log
Problem: 'w' in fopen should be 'a'
This obviously is a super easy example, if you want to give them a bigger more complicated example give them the WMD source and ask them to give you readable code in 2 hours, it will get your point across.

As long as x, y and z are (3D) Cartesian co-ordinates, then they're great names.
In a similar vein, i, j and k would be OK for loop variables.
In all cases, the variable names should relate to the data

x,y and z are acceptable variable names if they represent 3d coordinates, or if they're used for iterating over 2 or 3 dimensional arrays.
This code is fine as far as I'm concerned:
for(int x = 0; x < xsize ; x++)
{
for(int y = 0; y < ysize ; y++)
{
for(int z = 0; z < zsize ; z++)
{
DoSomething(data[x][y][z]);
...

This one is a short answer, but it works very well for me:
If it would need a code comment to describe it, then rethink the variable name.
So if it's obvious, why "x" was choosen, then they are good names. E.g. "i" as variable name in a loop is (often) pretty obvious.

An ideal variable name is both short (to make the code more compact) and descriptive (to help understanding the code).
Opinions differ on which of the two is more important. Personally, I'd say it depends on the scope of the variable. A variable used only inside a 3 line loop can get away with being single letter. A class field in a 500 line class better be pretty damn descriptive. The Spartan Programming philosophy says that as far as possible, all units of code should be small enough that variable names can be very short.

Readable code and good naming conventions are not the same thing!
A good name for a variable is one that allows you to understand (or reasonably guess) the purpose and type of the variable WITHOUT seeing the context in which it is used. Thus, "x" "y" and "z" say coordinates because that is a reasonable guess. Conversely, a bad name is one that leads you to a wrong likely guess. For example, if "x" "y" and "z" represent people.
A good name for a function is one that conveys everything you would need to know about it without having to consult its documentation. That is not always possible.
Readable code is first of all code whose structured could be understood even if you obfuscated all variable and function names. If you do that and can't figure out the control structure easily, you're screwed.
Once you have readable code and good naming, then maybe you'll have truly readable code.

Related

Is it acceptable to use `to` to create a `Pair`?

to is an infix function within the standard library. It can be used to create Pairs concisely:
0 to "hero"
in comparison with:
Pair(0, "hero")
Typically, it is used to initialize Maps concisely:
mapOf(0 to "hero", 1 to "one", 2 to "two")
However, there are other situations in which one needs to create a Pair. For instance:
"to be or not" to "be"
(0..10).map { it to it * it }
Is it acceptable, stylistically, to (ab)use to in this manner?
Just because some language features are provided does not mean they are better over certain things. A Pair can be used instead of to and vice versa. What becomes a real issue is that, does your code still remain simple, would it require some reader to read the previous story to understand the current one? In your last map example, it does not give a hint of what it's doing. Imagine someone reading { it to it * it}, they would be most likely confused. I would say this is an abuse.
to infix offer a nice syntactical sugar, IMHO it should be used in conjunction with a nicely named variable that tells the reader what this something to something is. For example:
val heroPair = Ironman to Spiderman //including a 'pair' in the variable name tells the story what 'to' is doing.
Or you could use scoping functions
(Ironman to Spiderman).let { heroPair -> }
I don't think there's an authoritative answer to this.  The only examples in the Kotlin docs are for creating simple constant maps with mapOf(), but there's no hint that to shouldn't be used elsewhere.
So it'll come down to a matter of personal taste…
For me, I'd be happy to use it anywhere it represents a mapping of some kind, so in a map{…} expression would seem clear to me, just as much as in a mapOf(…) list.  Though (as mentioned elsewhere) it's not often used in complex expressions, so I might use parentheses to keep the precedence clear, and/or simplify the expression so they're not needed.
Where it doesn't indicate a mapping, I'd be much more hesitant to use it.  For example, if you have a method that returns two values, it'd probably be clearer to use an explicit Pair.  (Though in that case, it'd be clearer still to define a simple data class for the return value.)
You asked for personal perspective so here is mine.
I found this syntax is a huge win for simple code, especial in reading code. Reading code with parenthesis, a lot of them, caused mental stress, imagine you have to review/read thousand lines of code a day ;(

Add spaces between words in spaceless string

I'm on OS X, and in objective-c I'm trying to convert
for example,
"Bobateagreenapple"
into
"Bob ate a green apple"
Is there any way to do this efficiently? Would something involving a spell checker work?
EDIT: Just some extra information:
I'm attempting to build something that takes some misformatted text (for example, text copy pasted from old pdfs that end up without spaces, especially from internet archives like JSTOR). Since the misformatted text is probably going to be long... well, I'm just trying to figure out whether this is feasibly possible before I actually attempt to actually write system only to find out it takes 2 hours to fix a paragraph of text.
One possibility, which I will describe this in a non-OS specific manner, is to perform a search through all the possible words that make up the collection of letters.
Basically you chop off the first letter of your letter collection and add it to the current word you are forming. If it makes a word (eg dictionary lookup) then add it to the current sentence. If you manage to use up all the letters in your collection and form words out of all of them, then you have a full sentence. But, you don't have to stop here. Instead, you keep running, and eventually you will produce all possible sentences.
Pseudo-code would look something like this:
FindWords(vector<Sentence> sentences, Sentence s, Word w, Letters l)
{
if (l.empty() and w.empty())
add s to sentences;
return;
if (l.empty())
return;
add first letter from l to w;
if w in dictionary
{
add w to s;
FindWords(sentences, s, empty word, l)
remove w from s
}
FindWords(sentences, s, w, l)
put last letter from w back onto l
}
There are, of course, a number of optimizations you could perform to make it go fast. For instance checking if the word is the stem of any word in the dictionary. But, this is the basic approach that will give you all possible sentences.
Solving this problem is much harder than anything you'll find in a framework. Notice that even in your example, there are other "solutions": "Bob a tea green apple," for one.
A very naive (and not very functional) approach might be to use a spell-checker to try to isolate one "real word" at a time in the string; of course, in this example, that would only work because "Bob" happens to be an English word.
This is not to say that there is no way to accomplish what you want, but the way you phrase this question indicates to me that it might be a lot more complicated than what you're expecting. Maybe someone can give you an acceptable solution, but I bet they'll need to know a lot more about what exactly you're trying to do.
Edit: in response to your edit, it would probably take less effort to run some kind of OCR tool on a PDF and correct its output than it would just to correct what this system might give you, let alone program it
I implemented a solution, the code is avaible on code project:
http://www.codeproject.com/Tips/704003/How-to-add-spaces-between-spaceless-strings
My idea was to prioritize results that use up most of the characters (preferable all of them) then favor the ones with the longest words, because 2,3 or 4 character long words can often come up by chance from leftout characters. Most of the times this provides the correct solution.
To find all possible permutations I used recursion. The code is quite fast even with big dictionaries (tested with 50 000 words).

can a variable have multiple values

In algebra if I make the statement x + y = 3, the variables I used will hold the values either 2 and 1 or 1 and 2. I know that assignment in programming is not the same thing, but I got to wondering. If I wanted to represent the value of, say, a quantumly weird particle, I would want my variable to have two values at the same time and to have it resolve into one or the other later. Or maybe I'm just dreaming?
Is it possible to say something like i = 3 or 2;?
This is one of the features planned for Perl 6 (junctions), with syntax that should look like my $a = 1|2|3;
If ever implemented, it would work intuitively, like $a==1 being true at the same time as $a==2. Also, for example, $a+1 would give you a value of 2|3|4.
This feature is actually available in Perl5 as well through Perl6::Junction and Quantum::Superpositions modules, but without the syntax sugar (through 'functions' all and any).
At least for comparison (b < any(1,2,3)) it was also available in Microsoft Cω experimental language, however it was not documented anywhere (I just tried it when I was looking at Cω and it just worked).
You can't do this with native types, but there's nothing stopping you from creating a variable object (presuming you are using an OO language) which has a range of values or even a probability density function rather than an actual value.
You will also need to define all the mathematical operators between your variables and your variables and native scalars. Same goes for the equality and assignment operators.
numpy arrays do something similar for vectors and matrices.
That's also the kind of thing you can do in Prolog. You define rules that constraint your variables and then let Prolog resolve them ...
It takes some time to get used to it, but it is wonderful for certain problems once you know how to use it ...
Damien Conways Quantum::Superpositions might do what you want,
https://metacpan.org/pod/Quantum::Superpositions
You might need your crack-pipe however.
What you're asking seems to be how to implement a Fuzzy Logic system. These have been around for some time and you can undoubtedly pick up a library for the common programming languages quite easily.
You could use a struct and handle the operations manualy. Otherwise, no a variable only has 1 value at a time.
A variable is nothing more than an address into memory. That means a variable describes exactly one place in memory (length depending on the type). So as long as we have no "quantum memory" (and we dont have it, and it doesnt look like we will have it in near future), the answer is a NO.
If you want to program and to modell this behaviour, your way would be to use a an array (with length equal to the number of max. multiple values). With this comes the increased runtime, hence the computations must be done on each of the values (e.g. x+y, must compute with 2 different values x1+y1, x2+y2, x1+y2 and x2+y1).
In Perl , you can .
If you use Scalar::Util , you can have a var take 2 values . One if it's used in string context , and another if it's used in a numerical context .

Good Examples of Hungarian Notation? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
This question is to seek out good examples of Hungarian Notation, so we can bring together a collection of these.
Edit: I agree that Hungarian for types isn't that necessary, I'm hoping for more specific examples where it increases readability and maintainability, like Joel gives in his article (as per my answer).
The problem with asking for good examples of Hungarian Notation is that everyone's going to have their own idea of what a good example looks like. My personal opinion is that the best Hungarian Notation is no Hungarian Notation. The notation was originally meant to denote the intended usage of a variable rather than its type but it's usually used for type information, particularly for Form controls (e.g., txtFirstName for a text box for someone's first name.). This makes the code less maintainable, in terms of readability (e.g., "prepIn nounTerms prepOf nounReadability") and refactoring for when the type needs to be changed (there are "lParams" in the Win32 API that have changed type).
You should probably consider not using it at all. Examples:
strFirstName - this can just be firstName since it's obvious what it's for, the type isn't that important and should be obvious in this case. If not obvious, the IDE can help you with that.
txtFirstName - this can change to FirstNameTextBox or FirstName_TextBox. It reads better and you know it's a control and not just the text.
CAccount - C was used for class names in MFC but you really don't need it. Account is good enough. The uppercase name is the standard convention for types (and they only appear in specific places so they won't get confused with properties or methods)
ixArray (index to array) - ix is a bit obscure. Try arrayIndex.
usState (unsafe string for State) - looks like "U.S. State". Better go with state_UnsafeString or something. Maybe even wrap it in an UnsafeString class to at least make it type-safe.
The now classic article, as mentioned in other Hungarian posts, is the one from Joel's site:
http://www.joelonsoftware.com/articles/Wrong.html
p
(for pointer). Its pretty much the only prefix I use. I think it adds a lot to a variable (eg that its a pointer) and so should be treated a little more respectfully.
Hungarian for datatypes is somewhat passe now IDEs can tell you what the type is (in only a few seconds hovering over the variable name), so its not so important. But treating a pointer as if its data is not good, so you want to make sure it's obvious to the user what it is even if he makes assumptions he shouldn't when coding.
t
Tainted data. Prefix all data incoming from an untrusted source to make that variable as tainted. All tainted data should be cleansed before any real work is done on it.
It's pointless to use Hungarian to indicate types because the compiler already does it for you.
Where Hungarian is useful is to distinguish between logically different sorts of variables that have the same raw type. For example, if you are using ints to represent coordinates, you could prefix x coordinates with x, y coordinates with y and distances with d. So you would have code that looks like
dxHighlight = xStart - xEnd
yHighlight = yLocation + 3
yEnd = yStart + dyHeight
dyCode = dyField * 2
and so on. It's useful because you can spot errors at a glance: If you add a dy to a y, you always get a y. If you subtract two x's you always get a dx. If you multiply a dy by a scalar, you always get a dy. And so on. If you see a line like
yTop = dyText + xButton
you know at a glance that it is wrong because adding a dy and a x does not make sense. The compiler could not catch this for you because as far as it can tell, you are adding an int to an int which is fine.
Do not use language specific prefixes.
We use:
n: Number
p: Percentage 1=100% (for interest rates etc)
c: Currency
s: String
d: date
e: enumeration
o: object (Customer oCustomer=new Customer();)
...
We use the same system for all languages:
SQL
C
C#
Javascript
VB6
VB.net
...
It is a life saver.
Devil's Advocate: The best example of Hungarian notation is not to use it. :D
We do not gain any advantage to using Hungarian notation with modern IDEs because they know the type. It adds work when refactoring a type for a variable since the name would also have to be changed (and most of the time when you are dealing with a variable you know what type it is anyway).
You can also get into ordering issues with the notation. If you use p for pointer and a for address do you call your variable apStreet or paStreet? Readability is diminished when you don't have consistency, and you have to use up valuable mind space when you have to remember the order that you have to write the notation in.
I find hungarian notation can sometimes be useful in dynamic languages. I'm specifically thinking of Server Side Actionscript (essentially just javascript), but it could apply elsewhere. Since there's no real type information at all, hungarian notation can sometimes help make things a bit easier to understand.
Hungarian notation (camel casing, as I learned it) is invaluable when you're inheriting a software project.
Yes, you can 'hover' over a variable with your IDE and find out what class it is, but if you're paging through several thousand lines of code you don't want to have to stop for those few seconds - every.... single.... time....
Remember - you're not writing code for you or your team alone. You're also writing it for the person who has to pick up this code 2-5 years down the road and enhance it.
I was strongly against Hungarian notation until I really started reading about it and trying to understand it's original intent.
After reading Joels post "Wrong" and the article "Rediscovering Hungarian Notation" I really changed my mind. Done correct I belive it must be extremly powerful.
Wrong by Joel Spolsky
http://www.joelonsoftware.com/articles/Wrong.html
Rediscovering Hungarian Notation
http://codingthriller.blogspot.com/2007/11/rediscovering-hungarian-notation.html
I belive that most Naysayers have never tried it for real and do not truly understand it.
I would love to try it out in a real project.
I think the key thing to take away from Joel's article, linked above, and Hungarian Notation in general, is to use it when there's something non-obvious about the variable.
One example, from the article, is encoded vs non encoded strings, it's not that you should use hungarian 'us' for unsafe strings and 's' for safe strings, it's that you should have some identifier to indicate that a string is either safe or not. If it becomes standard, it becomes easy to see when the standard is being broken.
The only Hungarian that's really useful anymore is m_ for member variables. (I also use sm_ for static members, because that's the "other" scope that still exists.) With widescreen monitors and compilers that take eight-billion-character-long variable names, abbreviating type names just isn't worth it.
m
When using an ORM (such as hibernate) you tend to deal managed and unmanaged objects. Changing an managed object will be reflected in the database without calling an explicit save, while dealing with a managaged object requires an explicit save call. How you deal with the object will be different depending on which it is.
I find that the only helpful point is when declaring interface controls, txtUsername, txtPassword, ddlBirthMonth. It isn't perfect, but it helps on large forms/projects.
I don't use it for variables or other items, just controls.
In addition to using 'p' for pointer, I like the idea of using 'cb' and 'cch' to indicate whether a buffer size parameter (or variable) is a count of bytes or a character count (I've also seen - rarely - 'ce' used to indicate a count of elements). So instead of conveying type, the prefix conveys use or intent.
I admit, I don't use the prefix as consistently as I probably should, but I like the idea.
A very old question, but here's a couple of "Hungarian" prefixes I use regularly:
my
for local variables, to distinguish locality where the name might make sense in a global context. If you see myFoo, it's only used in this function, regardless of anything else we do with Foos anywhere else.
myStart = GetTime();
doComplicatedOperations();
print (GetTime() - myStart);
and
tmp
for temporary copies of values in loops or multi-step operations. If you see two tmpFoo variables more than a couple of lines from each other, they're almost certainly unrelated.
tmpX = X;
tmpY = Y;
X = someCalc(tmpX, tmpY);
Y = otherCalc(tmpX, tmpY);
and sometimes old and new in for similar reasons to tmp, usually in longer loops or functions.
Well, I use it only with window control variables. I use btn_, txt_, lbl_ etc to spot them. I also find it helpful to look up the control's name by typing its type (btn_ etc).
I only ever use p for a pointer, and that's it. And that's only if I'm in C++. In C# I don't use any hungarian notation.
e.g.
MyClass myClass;
MyClass* pMyClass;
That's all :)
Edit: Oh, I just realised that's a lie. I use "m_" for member variables too. e.g.
class
{
private:
bool m_myVar;
}
I agree that Hungarian notation is no longer particularly useful. I thought that its original intention was to indicate not datatype, but rather entity type. In a code section involving the names of customers, employees and the user, for example, you could name local string variables cusName, empName and usrName. That would help distinguish among similar-sounding variable names. The same prefixes for the entities would be used throughout the application. However, when OO is used, and you're dealing with objects, those prefixes are redundant in Customer.Name, Employee.Name and User.Name.
The name of the variable should describe what it is. Good variable naming makes Hungarian notation useless.
However, sometimes you'd use Hungarian notation in addition to good variable naming. m_numObjects has two "prefixes:" m_ and num. m_ indicates the scope: it's a data member tied to this. num indicates what the value is.
I don't feel hindered at all when I read "good" code, even if it does contain some "Hungarian." Right: I read code, I don't click it. (In fact, I hardly use my mouse ever when coding, or any voodoo programming-specific lookup features.)
I am slowed when I read things like m_ubScale (yes, I'm looking at you, Liran!), as I have to look at its usage (no comments!) to find out what it scales (if at all?) and it's datatype (which happens to be a fixed-point char). A better name would be m_scaleFactor or m_zoomFactor, with a comment as a fixed-point number, or even a typedef. (In fact, a typedef would be useful, as there are several other members of several classes which use the same fixed-point format. However, some don't, but are still labeled m_ubWhatever! Confusing, to say the least.)
I think Hungarian was meant to be an additive to the variable name, not a replacement for information. Also, many times Hungarian notation adds nothing at all to the variable's readability, wasting bytes and read time.
Just my 2¢.
There's no such thing as a good example of hungarian notation. Just don't use it. Not even if you are using a weakly typed language. You'll live happier.
But if you really need some reason not to use it, this is my favourite one, extracted from this great link:
One followon trick in the Hungarian notation is "change the type of a variable but leave the variable name unchanged". This is almost invariably done in windows apps with the migration from Win16 :- WndProc(HWND hW, WORD wMsg, WORD wParam, LONG lParam) to Win32 WndProc(HWND hW, UINT wMsg, WPARAM wParam, LPARAM lParam) where the w values hint that they are words, but they really refer to longs. The real value of this approach comes clear with the Win64 migration, when the parameters will be 64 bits wide, but the old "w" and "l" prefixes will remain forever.
I find myself using 'w' meaning 'working', as a prefix instead of 'temp' or 'tmp', for local variables that are just there to jockey data around, like:
Public Function ArrayFromDJRange(rangename As Range, slots As Integer) As Variant
' this function copies a Disjoint Range of specified size into a Variant Array 7/8/09 ljr
Dim j As Integer
Dim wArray As Variant
Dim rCell As Range
wArray = rangename.Value ' to initialize the working Array
ReDim wArray(0, slots - 1) ' set to size of range
j = 0
For Each rCell In rangename
wArray(0, j) = rCell.Value
j = j + 1
Next rCell
ArrayFromDJRange = wArray
End Function

What's bad about the VB With/End With keyword?

In this question, a user commented to never use the With block in VB. Why?
"Never" is a strong word.
I think it fine as long as you don't abuse it (like nesting)
IMHO - this is better:
With MyCommand.Parameters
.Count = 1
.Item(0).ParameterName = "#baz"
.Item(0).Value = fuz
End With
Than:
MyCommand.Parameters.Count = 1
MyCommand.Parameters.Item(0).ParameterName = "#baz"
MyCommand.Parameters.Item(0).Value = fuz
There is nothing wrong about the With keyword. It's true that it may reduce readibility when nested but the solution is simply don't use nested With.
There may be namespace problems in Delphi, which doesn't enforce a leading dot but that issue simply doesn't exist in VB.NET so the people that are posting rants about Delphi are losing their time in this question.
I think the real reason many people don't like the With keyword is that is not included in C* languages and many programmers automatically think that every feature not included in his/her favourite language is bad.
It's just not helpful compared to other options.
If you really miss it you can create a one or two character alias for your object instead. The alias only takes one line to setup, rather than two for the With block (With + End With lines).
The alias also gives you a quick mouse-over reference for the type of the variable. It provides a hook for the IDE to help you jump back to the top of the block if you want (though if the block is that large you have other problems). It can be passed as an argument to functions. And you can use it to reference an index property.
So we have an alternative that gives more function with less code.
Also see this question:
Why is the with() construct not included in C#, when it is really cool in VB.NET?
The with keyword is only sideswiped in a passing reference here in an hilarious article by the wonderful Verity Stob, but it's worth it for the vitriol: See the paragraph that starts
While we are on identifier confusion. The with keyword...
Worth reading the entire article!
The With keyword also provides another benefit - the object(s) in the With statement only need to be "qualified" once, which can improve performance. Check out the information on MSDN here:
http://msdn.microsoft.com/en-us/library/wc500chb(VS.80).aspx
So by all means, use it.