Should I give up grammatical correctness when naming my functions to offer regularity? - naming-conventions

I implement several global functions in our library that look something like this:
void init_time();
void init_random();
void init_shapes();
I would like to add functions to provide a check whether those have been called:
bool is_time_initialized();
bool is_random_initialized();
bool are_shapes_initialized();
However, as you can see are_shapes_initialized falls out of the row due to the fact that shapes is plural and therefore the function name must start with are and not is. This could be a problem, as the library is rather large and not having a uniform way to group similiar functions under the same naming convention might be confusing / upsetting.
E.g. a user using IntelliSense quickly looking up function names to see if the libary offers a way to check if their initialization call happened:
They won't find are_shapes_initialized() here unless scrolling through hundreds of additional function / class names.
Just going with is_shapes_initialized() could offer clarity:
As this displays all functions, now.
But how can using wrong grammar be a good approach? Shouldn't I just assume that the user should also ask IntelliSense for "are_initialized" or just look into the documentation in the first place? Probably not, right? Should I just give up on grammatical correctness?

The way I see it, a variable is a single entity. Maybe that entity is an aggregate of other entities, such as an array or a collection, in which case it would make sense to give it a plural name e.g. a set of Shape objects could be called shapes. Even so, it is still a single object. Looking at it that way, it is grammatically acceptable to refer to it as singular. After all, is_shapes_initialized actually means "Is the variable 'shapes' initialized?"
It's the same reason we say "The Bahamas is" or "The Netherlands is", because we are referring to the singular country, not whatever plural entity it is comprised of. So yes, is_shapes_initialized can be considered grammatically correct.

It's more a matter of personal taste. I would recommend putting "is" before functions that return Boolean. This would look more like:
bool is_time_initialized();
bool is_random_initialized();
bool is_shapes_initialized();
This makes them easier to find and search for, even if they aren't grammatically correct.
You can find functions using "are" to show it is plural in places such as the DuckDuckGo app, with:
areItemsTheSame(...)
areContentsTheSame(...)
In the DuckDuckGo app, it also uses "is" to show functions return boolean, and boolean variables:
val isFullScreen: Boolean = false
isAssignableFrom(...)
In OpenTK, a C# Graphics Library, I also found usage of "are":
AreTexturesResident(...)
AreProgramsResident(...)
In the same OpenTK Libary, they use "is" singularly for functions that return boolean and boolean variables:
IsEnabledGenlock(...)
bool isControl = false;
Either usage could work. Using "are" plurally would make more sense grammatically, and using "if" plurally could make more sense for efficiency or simplifying Boolean functions.

Here's what I would do, assuming you are trying to avoid calling this function on each shape.
void init_each_shape();
bool is_each_shape_initialized();
Also assuming that you need these functions, it seems like it would make more sense to have the functions throw an exception if they do not succeed.

Related

Variable shorthands

Not really related to programming in general.
I don't know if this is a struggle for anybody, and there's probably an easy solution to this problem, but I couldn't find any elegant solutions.
Everyone knows that if you're a programmer, you should write readable code to save time and help yourself in the future, like giving variables better names instead of s or n.
For Example:
public void doSomething(Function<> functionToDo, int numberOfTimes)
instead of:
public void doIt(Function<> f, int n)
But sometimes, if I have a long variable name and I have to type it in an equation that makes me have to scroll right to see the whole thing, that can get frustrating.
So, my question is: Is there any way I can define a shortcut variable that doesn't affect runtime or memory?
like c++'s pre-proccesor statement #define: #define n numberOfTimes
Or, if there isn't solution to this at all, should I keep long variable names for the readability, or keep things short instead?
Any ideas are appreciated.
It's all about the context where an identifier is declared. For instance, if your function doIt was named doNTimes it would be perfectly fine to name the parameters f and n. Also, they are local to the function so you don't need to search for their documentation (which should be just before or after the function header). As you mention, in choosing a name there is also a tradeoff between identifier comprehensibility and expression comprehensibility; whereas a more descriptive name increases the former and decreases the latter the opposite holds true for a short name.
If you know that your identifier is going to be used in complex expressions it's a good idea to use a shorter name. A function call with side-effects on the other hand will (should) only be a single statement so then the name can be longer.
To summarize, I would say that it's a good idea to keep formal parameters and local variables short as that make expressions easy to comprehend; the documentation is right there in the function anyway, e.g.
public void doNTimes(Function<> f, int n); /** apply f n times */
Note: In a real scenario you would also need to provide the actual parameters of f.

Should I use an explicit return type for a String variable in Kotlin?

In Kotlin, We can declare a string read-only variable with type assignment and without type assignment (inferred) as below.
val variable_name = "Hello world"
or
val variable_name: String = "Hello world"
I'm trying to figure out what is the best in Kotlin and why it is the best way. Any idea?
If this is a public variable, using an explicit return type is always a good idea.
It can make the code easier to read and use. This is why your IDE probably shows the return type anyway, even when you omit it from the code. It's less important for simple properties like yours where the return type is easy to see at a glance, but when the property or method is more than a few lines it makes much more difference.
It prevents you from accidentally changing the type. With an explicit return type, if you change the contents of the property so that it doesn't actually return the correct type, you'll get an immediate compile error in that method or property. With an implicit type, if you change the contents of the method you could see cascading errors throughout your code base, making it hard to find the source of the error.
It can actually speed up your IDE! See this blog post from the JetBrains team for more information.
For private variables, explicit return types are much less important, because the above points don't generally apply.
Personally either one works and for me nothing is wrong, but I would choose the later if this is a team project, where project size increase and feature inheritance(members leaving, new hiring or worse shuffling people) is probable. Also I consider the later as more of a courtesy.
There are situations where regardless of the dogma every member follows, such as clean architecture, design-patterns or clean-coding, bloated codes or files are always expected to occur in such big projects occasionally, so the later would help anyone especially new members to easily recognize at first glance what data type they are dealing with.
Again this this is not about right or wrong, as kotlin is created to be idiomatic, I think this is Autoboxing, it was done in kotlin for codes to be shorter and cleaner as few of its many promise, but again regardless of the language, sometimes its the developer's discretion to have a readable code or not.
This also applies with function return types, I always specify my function return types just so the "new guy" or any other developer will understand my function signatures right away, saving him tons of brain cells understanding whats going on.
fun isValidEmail() : Boolean = if (condition) true else false
fun getValidatedPerson(): Person = repository.getAuthenticatedPersonbyId(id)
fun getCurrentVisibleScreen(): #Composable ()-> Unit = composables.get()
fun getCurrentContext(): Context if (isActivity) activityContext else applicationContext

Using If...Else vs If()

Does cleanliness trump performance here:
Version 1:
Function MyFunc(ByVal param as String) As String
Dim returnValue as String
If param Is Nothing Then
returnValue = "foo"
Else
returnValue = param
return returnValue
Version 2:
Function MyFunc(ByVal param as String) As String
return If(param,"foo")
Version 1 deals directly with unboxed Strings. Version 2 deals with all boxed Objects. [If() takes a TestExpression as Object, a FalsePart as Object and returns an Object]
[can't add comments]
COMMENT: ja72, fixed my naming.
COMMENT: Marc, so you would go with Version 2?
I think clarity trumps anything.
The If(obj1,obj2) function is the null coalescing operator of VB.NET. It functions the same as obj1 ?? obj2 in C#. As such, everyone should know what it means, and it should be used where conciseness is important.
Although the If/Else statement is clean, simple, and obvious, in this particular case, I would favor the If function.
The compiler would optimize these two to the same code or nearly the same depending on the optimization level (See project properties).
Write two methods this way, compile them and use Reflector to look into the VB.Net decompiled code (or even MSIL) and you will see that there is very little (some billionth of a second) or none difference in exectuion.
Compiler optimizations generally handle normal patterns that allows you to write if-statements and loops in different ways. For instance in .Net for, foreach, while, do, etc do not actually exist. They are language specific features that are compiled down to goto-statement logic in the assembly level. Use Reflector to look at a few of these and you'll learn a lot! :)
Note that it is possible to write bad code that the compiler can't optimize to its "best state", and it is even possible to do better than the compiler. Understanding .Net assembly and MSIL means understanding the compiler better.
Really? I don't think this function is going to be a bottleneck in any application, and so just go with brevity/clarity.
I would recommend:
Public Function TXV(ByVal param As String) As String
Return If(param Is Nothing, "foo", param)
End Function
and make sure the function returns a string (to keep type safety). BTW, why is your Function called MySub ? Shouldn't it be MyFunc ?
I believe that these two implementations are nearly the same, I would use the second one because it's shorter.
Since I come from a C background, I would opt for the ternary operator most times where it is clear what is happening - in a case like this where there is repetition and it can be idiomatic. Similarly in T-SQL where you can use COALESCE(a, b, c, d, e) to avoid having a bunch of conditionals and simply take the first non-null value - this is idiomatic and easily read.
Beware that the old IIf function is different from the new If operator, so while the new one properly handles side-effects and short-circuits, it's only one character away from a completely different behavior which people have long been wary of.
http://secretgeek.net/iif_function.asp
http://visualbasic.about.com/od/usingvbnet/a/ifop.htm
I don't think it's going to matter in terms of performance, because the optimizer is pretty good about these kind of transforms.

Boolean method naming readability

Simple question, from a readability standpoint, which method name do you prefer for a boolean method:
public boolean isUserExist(...)
or:
public boolean doesUserExist(...)
or:
public boolean userExists(...)
public boolean userExists(...)
Would be my prefered. As it makes your conditional checks far more like natural english:
if userExists ...
But I guess there is no hard and fast rule - just be consistent
I would say userExists, because 90% of the time my calling code will look like this:
if userExists(...) {
...
}
and it reads very literally in English.
if isUserExist and if doesUserExist seem redundant.
Beware of sacrificing clarity whilst chasing readability.
Although if (user.ExistsInDatabase(db)) reads nicer than if (user.CheckExistsInDatabase(db)), consider the case of a class with a builder pattern, (or any class which you can set state on):
user.WithName("Mike").ExistsInDatabase(db).ExistsInDatabase(db2).Build();
It's not clear if ExistsInDatabase is checking whether it does exist, or setting the fact that it does exist. You wouldn't write if (user.Age()) or if (user.Name()) without any comparison value, so why is if (user.Exists()) a good idea purely because that property/function is of boolean type and you can rename the function/property to read more like natural english? Is it so bad to follow the same pattern we use for other types other than booleans?
With other types, an if statement compares the return value of a function to a value in code, so the code looks something like:
if (user.GetAge() >= 18) ...
Which reads as "if user dot get age is greater than or equal to 18..." true - it's not "natural english", but I would argue that object.verb never resembled natural english and this is simply a basic facet of modern programming (for many mainstream languages). Programmers generally don't have a problem understanding the above statement, so is the following any worse?
if (user.CheckExists() == true)
Which is normally shortened to
if (user.CheckExists())
Followed by the fatal step
if (user.Exists())
Whilst it has been said that "code is read 10x more often than written", it is also very important that bugs are easy to spot. Suppose you had a function called Exists() which causes the object to exist, and returns true/false based on success. You could easily see the code if (user.Exists()) and not spot the bug - the bug would be very much more obvious if the code read if (user.SetExists()) for example.
Additionally, user.Exists() could easily contain complex or inefficient code, round tripping to a database to check something. user.CheckExists() makes it clear that the function does something.
See also all the responses here: Naming Conventions: What to name a method that returns a boolean?
As a final note - following "Tell Don't Ask", a lot of the functions that return true/false disappear anyway, and instead of asking an object for its state, you tell it to do something, which it can do in different ways based on its state.
The goal for readability should always be to write code the closest possible to natural language. So in this case, userExists seems the best choice. Using the prefix "is" may nonetheless be right in another situations, for example isProcessingComplete.
My simple rule to this question is this:
If the boolean method already HAS a verb, don't add one. Otherwise, consider it. Some examples:
$user->exists()
$user->loggedIn()
$user->isGuest() // "is" added
I would go with userExists() because 1) it makes sense in natural language, and 2) it follows the conventions of the APIs I have seen.
To see if it make sense in natural language, read it out loud. "If user exists" sounds more like a valid English phrase than "if is user exists" or "if does user exist". "If the user exists" would be better, but "the" is probably superfluous in a method name.
To see whether a file exists in Java SE 6, you would use File.exists(). This looks like it will be the same in version 7. C# uses the same convention, as do Python and Ruby. Hopefully, this is a diverse enough collection to call this a language-agnostic answer. Generally, I would side with naming methods in keeping with your language's API.
There are things to consider that I think were missed by several other answers here
It depends if this is a C++ class method or a C function. If this is a method then it will likely be called if (user.exists()) { ... } or if (user.isExisting()) { ... }
not if (user_exists(&user)) .
This is the reason behind coding standards that state bool methods should begin with a verb since they will read like a sentence when the object is in front of them.
Unfortunately lots of old C functions return 0 for success and non-zero for failure so it can be difficult to determine the style being used unless you follow the all bool functions begin with verbs or always compare to true like so if (true == user_exists(&user))
Why not rename the property then?
if (user.isPresent()) {
Purely subjective.
I prefer userExists(...) because then statements like this read better:
if ( userExists( ... ) )
or
while ( userExists( ... ) )
In this particular case, the first example is such horrible English that it makes me wince.
I'd probably go for number three because of how it sounds when reading it in if statements. "If user exists" sounds better than "If does user exists".
This is assuming it's going to be to used in if statement tests of course...
I like any of these:
userExists(...)
isUserNameTaken(...)
User.exists(...)
User.lookup(...) != null
Method names serves for readability, only the ones fit into your whole code would be the best which most of the case it begins with conditions thus subjectPredicate follows natural sentence structure.
Since I follow the convention to put verb before function name, I would do the same here too:
//method name
public boolean doesExists(...)
//this way you can also keep a variable to store the result
bool userExists = user.doesExists()
//and use it like a english phrase
if (userExists) {...}
//or you can use the method name directly also and it will make sense here too
if (user.doesExists()) {...}

Separate Namespaces for Functions and Variables in Common Lisp versus Scheme

Scheme uses a single namespace for all variables, regardless of whether they are bound to functions or other types of values. Common Lisp separates the two, such that the identifier "hello" may refer to a function in one context, and a string in another.
(Note 1: This question needs an example of the above; feel free to edit it and add one, or e-mail the original author with it and I will do so.)
However, in some contexts, such as passing functions as parameters to other functions, the programmer must explicitly distinguish that he's specifying a function variable, rather than a non-function variable, by using #', as in:
(sort (list '(9 A) '(3 B) '(4 C)) #'< :key #'first)
I have always considered this to be a bit of a wart, but I've recently run across an argument that this is actually a feature:
...the
important distinction actually lies in the syntax of forms, not in the
type of objects. Without knowing anything about the runtime values
involved, it is quite clear that the first element of a function form
must be a function. CL takes this fact and makes it a part of the
language, along with macro and special forms which also can (and must)
be determined statically. So my question is: why would you want the
names of functions and the names of variables to be in the same
namespace, when the primary use of function names is to appear where a
variable name would rarely want to appear?
Consider the case of class names: why should a class named FOO prevent
the use of variables named FOO? The only time I would be referring the
class by the name FOO is in contexts which expect a class name. If, on
the rare occasion I need to get the class object which is bound to the
class name FOO, there is FIND-CLASS.
This argument does make some sense to me from experience; there is a similar case in Haskell with field names, which are also functions used to access the fields. This is a bit awkward:
data Point = Point { x, y :: Double {- lots of other fields as well --} }
isOrigin p = (x p == 0) && (y p == 0)
This is solved by a bit of extra syntax, made especially nice by the NamedFieldPuns extension:
isOrigin2 Point{x,y} = (x == 0) && (y == 0)
So, to the question, beyond consistency, what are the advantages and disadvantages, both for Common Lisp vs. Scheme and in general, of a single namespace for all values versus separate ones for functions and non-function values?
The two different approaches have names: Lisp-1 and Lisp-2. A Lisp-1 has a single namespace for both variables and functions (as in Scheme) while a Lisp-2 has separate namespaces for variables and functions (as in Common Lisp). I mention this because you may not be aware of the terminology since you didn't refer to it in your question.
Wikipedia refers to this debate:
Whether a separate namespace for functions is an advantage is a source of contention in the Lisp community. It is usually referred to as the Lisp-1 vs. Lisp-2 debate. Lisp-1 refers to Scheme's model and Lisp-2 refers to Common Lisp's model. These names were coined in a 1988 paper by Richard P. Gabriel and Kent Pitman, which extensively compares the two approaches.
Gabriel and Pitman's paper titled Technical Issues of Separation in Function Cells and Value Cells addresses this very issue.
Actually, as outlined in the paper by Richard Gabriel and Kent Pitman, the debate is about Lisp-5 against Lisp-6, since there are several other namespaces already there, in the paper are mentioned type names, tag names, block names, and declaration names. edit: this seems to be incorrect, as Rainer points out in the comment: Scheme actually seems to be a Lisp-1. The following is largely unaffected by this error, though.
Whether a symbol denotes something to be executed or something to be referred to is always clear from the context. Throwing functions and variables into the same namespace is primarily a restriction: the programmer cannot use the same name for a thing and an action. What a Lisp-5 gets out of this is just that some syntactic overhead for referencing something from a different namespace than what the current context implies is avoided. edit: this is not the whole picture, just the surface.
I know that Lisp-5 proponents like the fact that functions are data, and that this is expressed in the language core. I like the fact that I can call a list "list" and a car "car" without confusing my compiler, and functions are a fundamentally special kind of data anyway. edit: this is my main point: separate namespaces are not a wart at all.
I also liked what Pascal Constanza had to say about this.
I've met a similar distinction in Python (unified namespace) vs Ruby (distinct namespaces for methods vs non-methods). In that context, I prefer Python's approach -- for example, with that approach, if I want to make a list of things, some of which are functions while others aren't, I don't have to do anything different with their names, depending on their "function-ness", for example. Similar considerations apply to all cases in which function objects are to be bandied around rather than called (arguments to, and return values from, higher-order functions, etc, etc).
Non-functions can be called, too (if their classes define __call__, in the case of Python -- a special case of "operator overloading") so the "contextual distinction" isn't necessarily clear, either.
However, my "lisp-oid" experience is/was mostly with Scheme rather than Common Lisp, so I may be subconsciously biased by the familiarity with the uniform namespace that in the end comes from that experience.
The name of a function in Scheme is just a variable with the function as its value. Whether I do (define x (y) (z y)) or (let ((x (lambda (y) (z y)))), I'm defining a function that I can call. So the idea that "a variable name would rarely want to appear there" is kind of specious as far as Scheme is concerned.
Scheme is a characteristically functional language, so treating functions as data is one of its tenets. Having functions be a type of their own that's stored like all other data is a way of carrying on the idea.
The biggest downside I see, at least for Common Lisp, is understandability. We can all agree that it uses different namespaces for variables and functions, but how many does it have? In PAIP, Norvig showed that it has "at least seven" namespaces.
When one of the language's classic books, written by a highly respected programmer, can't even say for certain in a published book, I think there's a problem. I don't have a problem with multiple namespaces, but I wish the language was, at the least, simple enough that somebody could understand this aspect of it entirely.
I'm comfortable using the same symbol for a variable and for a function, but in the more obscure areas I resort to using different names out of fear (colliding namespaces can be really hard to debug!), and that really should never be the case.
There's good things to both approaches. However, I find that when it matters, I prefer having both a function LIST and a a variable LIST than having to spell one of them incorrectly.