I recently heard of the Anti-IF Campaign and the efforts of some OOP gurus to write code without ifs, using polymorphism instead. I just don't get how that is supposed to work; I mean, how it is supposed to ALWAYS work.
I already use polymorphism (I didn't know about the Anti-IF Campaign), so I was curious about these "bad" and "dangerous" ifs, and I went through my code (Java/Swift/Objective-C) to see where I use if most. It looks like these are the cases:
Checking for null values. This is the most common situation where I use ifs. If a value could possibly be null, I have to handle it correctly: before using it, I have to check that it's not null. I don't see how polymorphism could compensate for this without ifs.
Checking for right values. Here's an example: suppose I have a login/signup application. I want to check that the user actually typed a password, or that it's longer than 5 characters. How could that possibly be done without ifs/switches? Again, it's not about the type but about the value.
(Optional) checking errors. Optional because it's similar to point 2 about right values. If I get either a value or an error (passed as parameters) in a block/closure, how can I handle the error object if I can't even check whether it's null or not?
If you know more about this campaign, please answer within its scope. I'm asking this to understand its goals and whether what it proposes can actually be done.
So, I know not using ifs at all may not be the smartest idea ever; I'm just asking if and how it could effectively be done in an OOP program.
You'll never completely get rid of ifs, but you can minimize them.
Regarding null value checks, a method that would otherwise return a null value can return a Null Object instead, an object that doesn't represent a real value but implements some of the same behavior as a real value. Its callers can just call methods on the Null Object instead of checking to see if it's null. There is probably still an if inside the method, but there don't need to be any in the callers.
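A minimal sketch of the Null Object pattern in Java; the Customer classes and the repository are made up for illustration:

interface Customer {
    String name();
    double discount();
}

class RealCustomer implements Customer {
    private final String name;
    RealCustomer(String name) { this.name = name; }
    public String name() { return name; }
    public double discount() { return 0.10; }
}

// The Null Object: harmless defaults instead of null.
class NullCustomer implements Customer {
    public String name() { return "guest"; }
    public double discount() { return 0.0; }
}

class CustomerRepository {
    Customer findByEmail(String email) {
        Customer found = lookup(email);                     // may find nothing
        return found != null ? found : new NullCustomer();  // the only if, hidden here
    }
    private Customer lookup(String email) { return null; }  // stand-in for a real query
}

Callers can then write repository.findByEmail(email).discount() without a single null check.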
Regarding correct value checks, the best path is to prevent an object from being instantiated with incorrect attributes. All users of the object can then be confident that they don't have to inspect the object's attributes to use it. Similarly, if an object can have an attribute that is valid or invalid, it can hide that from its users by providing higher-level methods that do the right thing for the current attribute value. Again, there is still an if inside the object, but there don't need to be any in the callers.
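For example (a hypothetical sketch), a constructor can refuse bad attribute values so that everything downstream can trust the object:

class Temperature {
    private final double celsius;

    Temperature(double celsius) {
        if (celsius < -273.15) {   // the if lives here, once, at construction time
            throw new IllegalArgumentException("Below absolute zero: " + celsius);
        }
        this.celsius = celsius;
    }

    double celsius() { return celsius; }
}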
Regarding error checks, there are a couple of strategies that are better than returning a possibly null error value that the caller might forget to check. One is raising an exception. Another is to return an object of a type that can hold either a result or an error and provides type-safe, if-free ways to operate on either result when appropriate, like Java's Optional or Haskell's Maybe.
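A hedged sketch using Java's Optional; findUser and User are made up for illustration:

import java.util.Optional;

class Greeter {
    static class User {
        final String name;
        User(String name) { this.name = name; }
    }

    // A lookup that may find nothing returns Optional instead of null.
    static Optional<User> findUser(String id) {
        return "alice".equals(id) ? Optional.of(new User("Alice")) : Optional.empty();
    }

    public static void main(String[] args) {
        // The caller declares what to do in either case; no explicit if.
        String greeting = findUser("bob")
                .map(u -> "Hello, " + u.name)
                .orElse("Hello, stranger");
        System.out.println(greeting);   // prints "Hello, stranger"
    }
}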
Note also that case statements are just concatenated ifs (in fact I'd have written the code on the campaign's home page with a switch rather than if/else if), and there are also patterns which replace case with polymorphism, such as the Strategy pattern.
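And a hedged sketch of the Strategy pattern replacing a switch; the shipping types and rates are invented for illustration:

import java.util.Map;

interface ShippingStrategy {
    double cost(double weightKg);
}

class StandardShipping implements ShippingStrategy {
    public double cost(double weightKg) { return 1.0 * weightKg; }
}

class ExpressShipping implements ShippingStrategy {
    public double cost(double weightKg) { return 3.5 * weightKg; }
}

class Shipping {
    // This map replaces: switch (type) { case "STANDARD": ... case "EXPRESS": ... }
    private static final Map<String, ShippingStrategy> STRATEGIES = Map.of(
            "STANDARD", new StandardShipping(),
            "EXPRESS", new ExpressShipping());

    static double quote(String type, double weightKg) {
        return STRATEGIES.get(type).cost(weightKg);   // no switch, no if
    }
}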
This is a great question and is something that's asked at every OO bootcamp I've been a part of. To begin with, we need to understand why code with a lot of ifs is 'bad' or 'dangerous':
they increase the cyclomatic complexity of the code, making it hard to follow/understand.
they make tests more complicated to write. Ensuring that you test each branch flow in the method under test becomes increasingly more difficult with each conditional and makes test setup cumbersome.
they could be a sign that your code has not been broken into small enough methods.
they could be a sign that your methods have not been encapsulated well.
However, there is one important thing to remember - ifs cannot (and should not) be eliminated from the code completely. But we can generally abstract them away using techniques like polymorphism, extracting small behaviours, and encapsulating these behaviours into the appropriate classes.
Now that we know some of the reasons why we should avoid ifs, let's tackle your questions:
Checking for null values: The Null Object pattern helps you eliminate null checks from your code (polymorphism FTW). Instead of returning null, you return a Special Case NullObject representation of the expected object. This NullObject has the same interface as your actual object, and you can safely call any of the object's methods without worrying about a null pointer exception being thrown.
Checking for correctness of values: There are a lot of ways to do this. For example, you could create a separate ValidationRule class for each of your validations and then chain calls to them together when you want to validate your object. Notice that the ifs still remain, but they get abstracted away into the individual ValidationRule implementations. Look up the Command pattern and the Chain Of Responsibility pattern for ideas.
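A minimal sketch of that idea; ValidationRule and the concrete rules are hypothetical names:

import java.util.List;

interface ValidationRule {
    void check(String password);   // throws if the rule is violated
}

class NotEmpty implements ValidationRule {
    public void check(String password) {
        if (password == null || password.isEmpty()) {
            throw new IllegalArgumentException("Password is required");
        }
    }
}

class MinLength implements ValidationRule {
    public void check(String password) {
        if (password.length() <= 5) {
            throw new IllegalArgumentException("Password must be longer than 5 characters");
        }
    }
}

class PasswordValidator {
    private final List<ValidationRule> rules = List.of(new NotEmpty(), new MinLength());

    void validate(String password) {
        rules.forEach(rule -> rule.check(password));   // the ifs live inside each rule
    }
}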
It's better to use an if to check for null than to raise an exception. Also, in common cases, checking for null helps us prevent operations on uninitialized variables.
Using switch plus SOLID; other approaches follow from this.
Mono.just(null) will not compile. Why is that?
On a procedural level I get it. It does not make sense to have a processing queue without something to process. Can someone phrase this for me with some more technical depth?
There are high risks in letting null into an application/library, and if you can ban it, you should.
Null is like letting a bomb into your application: if something can potentially be null, then something can explode with a NullPointerException at any time.
null creates an enormous uncertainty in an application at all times. You should always clean out null as early as possible.
People solve this by doing null checks everywhere, which are basically redundant operations.
Sir Tony Hoare, the inventor of null, famously called it his:
billion dollar mistake.
There are plenty of programming languages out there that don't have null; here is a long list:
List of languages without null
Having null in a stream makes no sense, as it would mean the Reactor library would have to do null checks all over its code to make sure it doesn't perform an operation on a null value. Everyone would have to do null checks in every flatMap, map, filter, etc., because there could be a potential NullPointerException in any of those operators.
So they took the opportunity to exclude null, which makes perfect sense: null is not a value, and only values can be transported in streams. That's probably why they decided against allowing it.
Instead they decided to use the type Void to represent "nothing", which can be obtained by calling Mono<Void> nothing = Mono.empty();. This is type safe, will not explode if you touch it, and behaves as you expect, since the developers control it and not the runtime.
I would instead ask myself: why do you actually need null? I'm guessing because you have gotten used to it, and coding out of habit is bad.
95% of the times I have seen null used, it could have been removed.
Learn instead how to code without null; make it a habit to use default values, fallback values, etc. It will most likely make your program more robust and more deterministic.
As stipulated by the Reactive Streams specification:
Calling onSubscribe, onNext, onError or onComplete MUST return normally except when any provided parameter is null in which case it MUST throw a java.lang.NullPointerException to the caller, for all other …
And Reactor is based on Reactive Streams.
Personally, I don’t like this rule, but the rule is not set by me. Since I want to use it, I can only follow it.
I guess Flux is the core of Reactor and Mono is incidental. For streams, banning null is probably a reasonable option, although it is inconvenient for Mono.
The reason has already been explained by others, and Mono.justOrEmpty(null) exists to make your life a little easier if you want the null to result in an empty Mono, but it needs to be a deliberate decision by you.
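A minimal sketch with Reactor; maybeName is a made-up nullable value used only for illustration:

import reactor.core.publisher.Mono;

public class NullToEmpty {
    public static void main(String[] args) {
        String maybeName = null;                       // a value that may legitimately be absent

        // justOrEmpty maps null to an empty Mono instead of throwing.
        Mono.justOrEmpty(maybeName)
            .map(String::toUpperCase)                  // operators never see null
            .defaultIfEmpty("UNKNOWN")                 // explicit fallback instead of null
            .subscribe(System.out::println);           // prints "UNKNOWN"
    }
}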
Let’s say I have a method that populates a list with some kind of objects. What are the advantages and disadvantages of following method designs?
void populate (ArrayList<String> list, other parameters ...)
ArrayList<String> populate(other parameters ...)
Which one should I prefer?
This looks like a general issue about method design but I couldn't find a satisfying answer on google, probably for not using the right keywords.
The second one seems more functional and thread safe to me. I'd prefer it in most cases. (Like every rule, there are exceptions.)
The owner of the populate method could return an immutable List (why ArrayList?).
It's also thread safe if there is no state modified in the populate method. Only passed in parameters are used, and these can also be immutable.
Other than what @duffymo mentioned, the second one is easier to understand, and thus to use: it is obvious what its input and output are.
Advantages to the in-out parameter:
You don't have to create as many objects. In languages like C or C++, where allocation and deallocation can be expensive, that can be a plus. In Java/C#, not so much -- GC makes allocation cheap and deallocation all but invisible, so creating objects isn't as big a deal. (You still shouldn't create them willy-nilly, but if you need one, the overhead isn't as bad as in some manual-allocation languages.)
You get to specify the type of the list. Potential plus if you need to pass that array to some other code you don't control later.
Disadvantages:
Readability issues.
In almost all languages that support function arguments, the first case is assumed to mean "do something with the entries in this list". Modifying args violates the Principle of Least Astonishment. The second is assumed to mean "give me a list of stuff", which is what you're after.
Every time you say "ArrayList", or even "List", you take away a bit of flexibility. You add some overhead to your API. What if I don't want to create an ArrayList before calling your method? I shouldn't have to, if the method's whole purpose in life is to return me some entries. That's the API's job.
Encapsulation issues:
The method being passed a list to fill can't assume anything about that list (even that it's a list at all; it could be null).
The method passing the list can't guarantee anything about what the method does with it. If it's working correctly, sure, the API docs can say "this method won't destroy existing entries". But considering the chance of bugs, that may not be worth trusting. At least if the method returns its own list, the caller doesn't have to worry about what was in it before. And it doesn't have to worry about a bug from a thousand miles away corrupting data it should never have affected.
Thread safety issues.
The list could be locked by another thread, meaning if we try and lock on it now it could potentially lock up the app.
Or, if not locked, it could still be modified by another thread, in which case we're no less screwed. Unless you're going to write extra code to handle concurrent-modification exceptions everywhere.
Returning a new list means every call to the method can have its own list. No thread can mess with another thread's return value, unless the class is very badly designed.
Side point: Being able to specify the type of the list often leads to dependencies on the type of the list. Notice how you're passing ArrayLists around everywhere. You're painting yourself into a corner by saying "this is an ArrayList" when you don't need to, and when you're passing it to a dozen methods, that's a dozen methods you'll have to change. (This is only slightly tangential: you could change the types to List rather than ArrayList and get rid of this, but the more you're passing that list around, the more places you'll need to change.)
Short version: Unless you have a damn good reason, use the first syntax only if you're using the existing contents of the list in your method, i.e. if you're modifying it or doing something with the existing values. If you intend to return a list of entries, then return a List of entries.
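For illustration, a hedged sketch of the second style (the class name and the filtering logic are made up):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NameSource {
    // Builds and returns its own list; every caller gets a fresh, read-only result.
    List<String> namesStartingWith(String prefix, List<String> allNames) {
        List<String> result = new ArrayList<>();
        for (String name : allNames) {
            if (name.startsWith(prefix)) {
                result.add(name);
            }
        }
        return Collections.unmodifiableList(result);
    }
}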
The second method is the preferred way for many reasons.
Primarily because the function signature is clearer and shows its intent.
It is actually recommended that you NEVER change the value of a parameter that is passed in to a function unless you explicitly mark it as an "out" parameter.
It will also be easier to use in expressions.
And it will be easier to change in the future, including moving to a more functional approach (for threading, etc.) if you would like to.
I have been learning Objective-C as my first language and understand Classes, Objects, instances, methods, OOP in general, etc enough to use the language and make simple applications work, but I wanted to check on a few fundamental questions that have never been explained in examples I followed.
I think the questions are so simple that they will confuse a lot of people, but I hope it will make sense to someone out there.
(While learning Objective-C the authors are assuming I have a basic computer programming background, yet I have found that a basic computer programming background is hard to come by since everyone teaching computer programming assumes you already have one to start teaching you something else. Hence the help with the fundamentals)
Passing and Returning:
When declaring methods with parameters, how is the parameter stuff actually working if the arguments being passed into the parameters can have different names than the parameter names? I hope that makes sense. I know parameter names are variables for that very reason, but...
are the arguments themselves getting mapped to a look up table or something?
Second, do the argument "types" (int, for example) have to match the parameter return types in order for them to be passed into the method, and do you always have to make your argument values equal the parameter names somewhere else in your code listing before passing them into the method?
Is the following correct: After a method gets executed it returns a particular value (if it is not void) to the class or instances that is calling the method in the first place.
Is object oriented programming really just passing "your" objects' instance methods around with the system-generated classes and methods to produce a result? If we are passing things to methods so they can do some work on them and then return something back, why not do the work in the first place, eliminating the need to pass anything? Theoretical question, I guess? I assume the answer would be: because that would be a crazy big tangled mess of a method with everything happening all at once, but I wanted to ask anyway.
Thank you for your time.
Variables are just places where values are stored. When you pass a variable as an argument, you aren't actually passing the variable itself — you're passing the value of the variable, which is copied into the argument. There's no mapping table or anything — it just takes the value of the variable and sticks it in the argument.
In the case of objects, the variable is a pointer to an object that exists somewhere in the program's memory space. In this case, the value of the pointer is copied just like any other variable, but it still points to the same object.
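The same idea sketched in Java (the class and method names are made up; the mechanics are analogous in Objective-C, where object variables are pointers):

import java.util.ArrayList;
import java.util.List;

public class PassByValueOfReference {
    // 'items' is a copy of the caller's reference, but it points to the same list object.
    static void addGreeting(List<String> items) {
        items.add("hello");        // visible to the caller: same underlying object
        items = new ArrayList<>(); // reassigning the copy does NOT affect the caller's variable
    }

    public static void main(String[] args) {
        List<String> mine = new ArrayList<>();
        addGreeting(mine);
        System.out.println(mine);  // prints [hello]
    }
}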
(the argument "types" … have to match the parameter return types…) It isn't technically true that the types have to be the same, though they usually should be. Some types can be automatically converted to another type. For example, a char or short will be promoted to an int if you pass them to a function or method that takes an int. There's a complicated set of rules around type conversions. One thing you usually should not do is use casts to shut up compiler warnings about incompatible types — the compiler takes that to mean, "It's OK, I know what I'm doing," even if you really don't. Also, object types cannot ever be converted this way, since the variables are just pointers and the objects themselves live somewhere else. If you assign the value of an NSString*variable to an NSArray* variable, you're just lying to the compiler about what the pointer is pointing to, not turning the string into an array.
Non-void functions and methods return a value to the place where they're called, yes.
(Is object-oriented programming…) Object-oriented programming is a way of structuring your program so that it can be conceptually described as a collection of objects sending messages to each other and doing things in response to those messages.
(why not do the work in the first place eliminating the need to pass anything) The primary problem in computer programming is writing code that humans can understand and improve later. Functions and methods allow us to break our code into manageable chunks that we can reason about. They also allow us to write code once and reuse it all over the place. If we didn't factor repeated code into functions, then we'd have to repeat the code every time it is needed, which both makes the program code much longer and introduces thousands of new opportunities for bugs to creep in. 50,000-line programs would become 500 million-line programs. Not only would the program be horrendously bug-ridden, but it would be such a huge ball of spaghetti that finding the bugs would be a Herculean task.
By the way, I think you might like Uli Kusterer's Masters of the Void. It's a programming tutorial for Mac users who don't know anything about programming.
"If we are passing things to methods so they can do some work to them and then return something back why not do the work in the first place eliminating the need to pass anything?"
In the beginning, that's how it was done.
But then smart programmers noticed that they were repeating copies of some work and also running out of memory, so they decided to put that chunk of work in one central place to save memory, and then call it by passing in the data from where it was before.
They gave the locations where the data was stuffed names, because the programs were big enough that nobody memorized all the numerical addresses for every bit of data any more.
Then really, really big computers finally got more than 16K of memory, and the programs started to become big unmanageable messes, so they codified the practice as part of structured programming. It's now a religious tenet.
But it's still done, by compilers when the inline flag is set, and also sometimes by hand on code that has to be really really fast on some very constrained processors by programmers who know when and where to make targeted trade-offs.
A little reading on the History of Computers is quite informative about how we got to where we are today, and why we do such strange things.
All those type checks are used (at most) during the compilation stage, to catch errors in your code.
Actually, during execution all variables are just blocks of memory that get sent somewhere. For example, the 'id' type and 'int' are both represented as 4-byte raw values, and you can write (int) and (id) casts to convert one to the other.
And about parameter names: they are used only by the compiler, to let it know to which memory area to send some data.
That's the easy explanation; in reality all that stuff is complicated, but I think you'll get the main idea: during execution there are no variable names/types, everything is done via operations on memory blocks.
I have the habit of always validating property setters against bad data, even if there's nowhere in my program that would reasonably input bad data. My QA person doesn't want me throwing exceptions unless I can explain where they would occur. Should I be validating all properties? Is there a standard on this I could point to?
Example:
public void setName(String newName) {
    if (newName == null) {
        throw new IllegalArgumentException("Name cannot be null");
    }
    name = newName;
}
...
//Only call to setName(String)
t.setName("Jim");
You're enforcing your method's preconditions, which are an important part of its contract. There's nothing wrong with doing that, and it also serves as self-documenting code (if I read your method's code, I immediately see what I shouldn't pass to it), though asserts may be preferable for that.
Personally, I prefer using asserts in these wildly improbable cases, both to avoid difficult-to-read code and to make it clear that assumptions are being made in the function's algorithms.
But, of course, this is very much a judgement call that has to be made on a case-by-case basis. You can see it (and I have seen it) get completely out of hand - to the point where a simple function is turned into a tangle of if statements that pretty much never evaluate to true.
You are doing OK!
Whether it's a setter or a function, always validate and throw a meaningful exception. You never know when you'll need it, and you will...
In general I don't favor this practice. It's not that performing validation is bad, but rather, on things like simple setters it tends to create more clutter than it's worth in protecting from bugs. I prefer using unit tests to ensure there are no bugs.
Well, it's been a while since this question was posted, but I'd like to give a different point of view on this topic.
Using the specific example you posted, IMHO you should be doing validation, but in a different way.
The key to achieving validation lies in the question itself. Think about it: you're dealing with names, not strings.
A string is a name when it's not null. We can also think of additional characteristics that make a string a name: it cannot be empty nor contain spaces.
Suppose you need to add those validation rules: if you stick with your approach, you'll end up cluttering your setter, as @SingleShot said.
Also, what would you do if more than one domain object has a setName setter?
Even if you use helper classes as @dave does, code will still be duplicated: the calls to the helper instances.
Now, think for a moment: what if all the arguments you could ever receive in the setName method were valid? No validation would be needed for sure.
I might sound too optimistic, but it can be done.
Remember you're dealing with names, so why not model the concept of a name?
Here's a vanilla, dummy implementation to show the idea:
public class Name {
    private final String value;

    public static Name from(String value) {
        if (value == null || value.isEmpty()) throw new IllegalArgumentException("Name cannot be empty");
        if (value.contains(" ")) throw new IllegalArgumentException("Name cannot contain spaces");
        return new Name(value);
    }

    private Name(String value) {
        this.value = value;
    }

    // other Name stuff goes here...
}
Because validation is happening at the moment of creation, you can only get valid Name instances. There's no way to create a Name instance from an "invalid" string.
Not only has the validation code been centralized, but the exceptions are also thrown in a context that gives them meaning (the creation of a Name instance).
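For instance, a hypothetical Person class could accept a Name instead of a raw String, and its setter then needs no validation at all:

public class Person {
    private Name name;

    // No validation needed here: a Name is valid by construction.
    public void setName(Name name) {
        this.name = name;
    }

    public static void main(String[] args) {
        Person p = new Person();
        p.setName(Name.from("Jim"));   // validation happened once, inside Name.from
    }
}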
You can read about great design principles in Hernan Wilkinson's "Design Principles Behind Patagonia" (the name example is taken from it). Be sure to check the ESUG 2010 Video and the presentation slides
Finally, I think you might find Jim Shore's "Fail Fast" article interesting.
It's a tradeoff. It's more code to write, review and maintain, but you'll probably find problems quicker if somehow a null name gets through.
I lean towards having it because eventually you find you do need it.
I used to have utility classes to keep the code to a minimum. So instead of
if (name == null) { throw new ...
you could have
Util.assertNotNull(name)
Then Java added asserts to the language, and you could do it more directly, and turn them off if you wanted.
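A minimal sketch of the same setter using a language-level assert; note that assertions are disabled at runtime unless the JVM is started with -ea:

public void setName(String newName) {
    assert newName != null : "Name cannot be null";   // checked only when assertions are enabled
    name = newName;
}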
It's well done in my opinion. For null values, throw IllegalArgumentException. For other kinds of validation, you should consider using a customized exception hierarchy related to your domain objects.
I'm not aware of any documented standard that says 'validate all user input' but it's a very good idea. In the current version of the program it may not be possible to access this particular property with invalid data but that's not going to prevent it from happening in the future. All sorts of interesting things happen during maintenance. And hey, you never know when someone will reuse the class in another application that doesn't validate data before passing it in.
I often find myself needing a reference to an object that is several objects away, or so it seems. The options I see are passing a reference through a middle-man or just making something available statically. I understand the danger of global scope, but passing a reference through an object that does nothing with it feels ridiculous. I'm okay with a little bit of passing around, I suppose. I suspect there's a line to be drawn somewhere.
Does anyone have insight on where to draw this line?
Or a good way to deal with the problem of distributing references amongst dependent objects?
Use the Law of Demeter (with moderation and good taste, not dogmatically). If you're coding a.b.c.d.e, something IS wrong -- you've nailed forevermore the implementation of a to have a b which has a c which... EEP!-) One or at the most two dots is the maximum you should be using. But the alternative is NOT to plump things into globals (and ensure thread-unsafe, buggy, hard-to-maintain code!), it is to have each object "surface" those characteristics it is designed to maintain as part of its interface to clients going forward, instead of just letting poor clients go through such unending chains of nested refs!
This smells of an abstraction that may need some improvement. You seem to be violating the Law of Demeter.
In some cases a global isn't too bad.
Consider: you're probably programming against an operating system's API. That's full of globals; you can access a file or the registry, write to the console, or look up a window handle. You can do loads of stuff to access state that is global across the whole computer, or even across the internet... and you don't have to pass a single reference to your class to access it. All this stuff is global if you access the OS's API.
So, when you consider the number of global things that often exist, a global in your own program probably isn't as bad as many people try and make out and scream about.
However, if you want to have very nice OO code that is all unit testable, I suppose you should be writing wrapper classes around any access to globals, whether they come from the OS or are declared by yourself, to encapsulate them. This means your class that uses this global state can get references to the wrappers, and they can be replaced with fakes.
Hmm, anyway. I'm not quite sure what advice I'm trying to give here, other than to say: structuring code is all a balance! And how to do it for your particular problem depends on your preferences, the preferences of the people who will use the code, how you're feeling that day on the academic-to-pragmatic scale, how big the code base is, how safety-critical the system is, and how far off the deadline for completion is.
I believe your question is revealing something about your classes. Maybe the responsibilities could be improved? Maybe moving some code would solve problems?
Tell, don't ask.
That's how it was explained to me. There is a natural tendency to call classes to obtain some data. Taken too far, asking too much typically leads to heavy "getter sequences". But there is another way. I must admit it is not easy to find, but it improves gradually in a specific codebase and in the coder's habits.
Class A wants to perform a calculation and asks for B's data. Sometimes, it is appropriate that A tells B to do the job, possibly passing some parameters. This could replace B's "getName()", used by A to check the validity of the name, with an "isValid()" method on B.
"Asking" has been replaced by "telling" (calling a method that executes the computation).
For me, this is the question I ask myself when I find too many getter calls. Gradually, the methods find their place in the correct object, and everything gets a bit simpler: I have fewer getters and fewer calls to them. I have less code, and it carries more meaning, better aligned with the functional requirements.
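A minimal sketch of that getName()/isValid() swap; the classes and the validity rule are made up for illustration:

// Asking: the caller pulls the data out and applies the rule itself.
//   if (user.getName() != null && !user.getName().isEmpty()) { ... }
//
// Telling: the rule lives next to the data it needs.
class User {
    private final String name;

    User(String name) { this.name = name; }

    // B exposes the decision, not the raw field.
    boolean isValid() {
        return name != null && !name.isEmpty() && !name.contains(" ");
    }
}

class Registration {
    boolean canRegister(User user) {
        return user.isValid();   // no getter chain, no duplicated validation logic
    }
}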
Move the data around
There are other cases where I move some data. For example, if a field moves two objects up, the length of the "getter chain" is reduced by two.
I believe nobody can find the correct model at first.
I first think about it (using hand-written diagrams is quick and a big help), then code it, then think again facing the real thing... Then I code the rest, and any smells I feel in the code, I think again...
Split and merge objects
If a method on A needs data from C, with B as a middle man, I can check whether A and C have something in common. Possibly, A or a part of A could become part of C (a possible splitting of A, or merging of A and C)...
However, there are cases where I keep the getters of course.
But it's less likely a long chain will be created.
A long chain will probably get broken by one of the techniques above.
I have three patterns for this:
Pass the necessary reference to the object's constructor -- the reference can then be stored as a data member of the object, and doesn't need to be passed again; this implies that the object's factory has the necessary reference. For example, when I'm creating a DOM, I pass the element name to the DOM node when I construct the DOM node.
Let things remember their parent, and get references to properties via their parent; this implies that the parent or ancestor has the necessary property. For example, when I'm creating a DOM, there are various things which are stored as properties of the top-level DomDocument ancestor, and its child nodes can access those properties via the reference which each one has to its parent.
Put all the different things which are passed around as references into a single class, and then pass around just that one class instance as the only thing that's passed around. For example, there are many properties required to render a DOM (e.g. the GDI graphics handle, the viewport coordinates, callback events, etc.) ... I put all of these things into a single 'Context' instance which is passed as the only parameter to the methods of the DOM nodes to be rendered, and each method can get whichever properties it needs out of that context parameter.
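A minimal sketch of that third pattern; the rendering properties here are placeholders for the real things the answer mentions (graphics handle, viewport coordinates, callbacks):

// One object gathers everything the rendering methods might need,
// so only a single reference is passed down the call chain.
class RenderContext {
    final Object graphicsHandle;    // stand-in for a real graphics handle
    final int viewportWidth;
    final int viewportHeight;

    RenderContext(Object graphicsHandle, int viewportWidth, int viewportHeight) {
        this.graphicsHandle = graphicsHandle;
        this.viewportWidth = viewportWidth;
        this.viewportHeight = viewportHeight;
    }
}

abstract class DomNode {
    // Every node receives the whole context and picks out only what it needs.
    abstract void render(RenderContext context);
}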