How could you improve this code design? - oop

Lately, I've been making a lot of use of the Strategy Pattern along with the Factory Pattern. And I really mean a lot. I have a lot of "algorithms" for everything and factories that retrieve algorithms based on parameters.
Even though the code seems very extensible, and it is, having N factories seems a bit of an abuse.
I know this is pretty subjective, and we're talking without seeing code, but is this acceptable in real-world code? Would you change something?

OK, ask yourself a question: does/will this algorithm implementation ever change? If no, then remove the strategy.

I work in maintenance.
I once was forced (by my pattern-lovin' boss) to write a set of 16 "buffer interpreter tuxedo services" using an AbstractFactory and a double DAO pattern in C++ (no reflection, no code gen). All up it was something like 20,000 lines of the nastiest code I've ever seen (not least because I didn't really know C++ when I started), and it took about three months.
Since my old boss has moved on, I've rewritten them using good ol' "straight up and down" procedural-style C++, with a couple of funky macros... each service is like 60 lines of code, times 16... all up less than 1,000 lines of really SIMPLE code; so simple that even I can follow it.
Cheers. Keith.

Whenever I'm implementing code in this fashion, some questions I ask are:
what components do I need to substitute to test?
what components will I expect users/admins to disable or substitute (e.g. via Spring configs or similar)?
what components do I expect or suspect will not be required in the future due to (possibly) changing requirements?
This all drives how I construct object or components (via factories) and how I implement algorithms. The above is vague, but (of course) the requirements can be similarly difficult to pin down. Without seeing your implementation and your requirements, I can't comment further, but the above should act as some guideline to help you determine whether what you've done is overkill.

If you're using the same design pattern all over the place, perhaps you should either switch to a language that has better support for what you're trying to do or rethink your code to be more idiomatic in your language of choice. After all, that's why we have more than one programming language.

It would depend on the kind of software I'm working on.
Maintenance asks for simple code, and factories are NOT simple code.
But extensibility asks sometimes for factories...
So you have to take both in consideration.
Just have in mind that most of the time, you will have to maintain a source file MANY times a year and you will NOT have to extend it.
IMO, patterns should only be used when absolutely needed. If you think they might come in handy in two years, you are better off using them... in two years.

How complex is the work the factory is handling? Does object creation really need to be abstracted to a different class? A variation of the factory method is having a simple, in-class factory. This really works best if any dependencies have already been injected.
For instance,
public class Customer
{
    public static Customer CreateNewCustomer()
    {
        // handle minimally complex create logic here,
        // then return the constructed instance
        return new Customer();
    }
}
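Callers would then go through the factory method, something like this (assuming the static form sketched above):
Customer customer = Customer.CreateNewCustomer();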
As far as Strategy overuse... Again, as #RichardOD explained, will the algorithm ever really change?
Always keep in mind the YAGNI principle. You Aren't Gonna Need It.

Can't you make an AbstractFactory instead of different standalone factories?
An AlgorithmFactory creates the algorithms you need based on the concrete factory.
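For illustration only, here is a minimal C# sketch of that idea; all the type names (IAlgorithm, IAlgorithmFactory, XmlAlgorithmFactory and friends) are invented, since the original code isn't shown:

public interface IAlgorithm
{
    void Execute(string input);
}

public interface IAlgorithmFactory
{
    IAlgorithm CreateParser();
    IAlgorithm CreateValidator();
}

// One concrete factory per "family" of algorithms replaces N standalone factories.
public class XmlAlgorithmFactory : IAlgorithmFactory
{
    public IAlgorithm CreateParser() => new XmlParser();
    public IAlgorithm CreateValidator() => new XmlValidator();
}

public class XmlParser : IAlgorithm
{
    public void Execute(string input) { /* parse */ }
}

public class XmlValidator : IAlgorithm
{
    public void Execute(string input) { /* validate */ }
}

The caller picks one concrete factory and asks it for whatever algorithms it needs, instead of knowing about a separate factory per algorithm.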

Related

How to understand the big picture in a loose coupled application?

We have been developing code using loose coupling and dependency injection.
A lot of "service" style classes have a constructor and one method that implements an interface. Each individual class is very easy to understand in isolation.
However, because of the looseness of the coupling, looking at a class tells you nothing about the classes around it or where it fits in the larger picture.
It's not easy to jump to collaborators using Eclipse because you have to go via the interfaces. If the interface is Runnable, that is no help in finding which class is actually plugged in. Really it's necessary to go back to the DI container definition and try to figure things out from there.
Here's a line of code from a dependency injected service class:-
// myExpiryCutoffDateService was injected,
Date cutoff = myExpiryCutoffDateService.get();
Coupling here is as loose as can be. The expiry date could be implemented in literally any manner.
Here's what it might look like in a more coupled application.
ExpiryDateService expiryDateService = new ExpiryDateService();
Date cutoff = expiryDateService.getCutoffDate(databaseConnection, paymentInstrument);
From the tightly coupled version, I can infer that the cutoff date is somehow determined from the payment instrument using a database connection.
I'm finding code of the first style harder to understand than code of the second style.
You might argue that when reading this class, I don't need to know how the cutoff date is figured out. That's true, but if I'm narrowing in on a bug or working out where an enhancement needs to slot in, that is useful information to know.
Is anyone else experiencing this problem? What solutions have you? Is this just something to adjust to? Are there any tools to allow visualisation of the way classes are wired together? Should I make the classes bigger or more coupled?
(Have deliberately left this question container-agnostic as I'm interested in answers for any).
While I don't know how to answer this question in a single paragraph, I attempted to answer it in a blog post instead: http://blog.ploeh.dk/2012/02/02/LooseCouplingAndTheBigPicture.aspx
To summarize, I find that the most important points are:
Understanding a loosely coupled code base requires a different mindset. While it's harder to 'jump to collaborators' it should also be more or less irrelevant.
Loose coupling is all about understanding a part without understanding the whole. You should rarely need to understand it all at the same time.
When zeroing in on a bug, you should rely on stack traces rather than the static structure of the code in order to learn about collaborators.
It's the responsibility of the developers writing the code to make sure that it's easy to understand - it's not the responsibility of the developer reading the code.
Some tools are aware of DI frameworks and know how to resolve dependencies, allowing you to navigate your code in a natural way. But when that isn't available, you just have to use whatever features your IDE provides as best you can.
I use Visual Studio and a custom-made framework, so the problem you describe is my life. In Visual Studio, SHIFT+F12 is my friend. It shows all references to the symbol under the cursor. After a while you get used to the necessarily non-linear navigation through your code, and it becomes second-nature to think in terms of "which class implements this interface" and "where is the injection/configuration site so I can see which class is being used to satisfy this interface dependency".
There are also extensions available for VS which provide UI enhancements to help with this, such as Productivity Power Tools. For instance, you can hover over an interface, an info box will pop up, and you can click "Implemented By" to see all the classes in your solution implementing that interface. You can double-click to jump to the definition of any of those classes. (I still usually just use SHIFT+F12 anyway.)
I just had an internal discussion about this, and ended up writing this piece, which I think is too good not to share. I'm copying it here (almost) unedited, but even though it's part of a bigger internal discussion, I think most of it can stand alone.
The discussion is about introduction of a custom interface called IPurchaseReceiptService, and whether or not it should be replaced with use of IObserver<T>.
Well, I can't say that I have strong data points about any of this - it's just some theories that I'm pursuing... However, my theory about cognitive overhead at the moment goes something like this: consider your special IPurchaseReceiptService:
public interface IPurchaseReceiptService
{
    void SendReceipt(string transactionId, string userGuid);
}
If we keep it as the Header Interface it currently is, it only has that single SendReceipt method. That's cool.
What's not so cool is that you had to come up with a name for the interface, and another name for the method. There's a bit of overlap between the two: the word Receipt appears twice. IME, sometimes that overlap can be even more pronounced.
Furthermore, the name of the interface is IPurchaseReceiptService, which isn't particularly helpful either. The Service suffix is essentially the new Manager, and is, IMO, a design smell.
Additionally, not only did you have to name the interface and the method, but you also have to name the variable when you use it:
public EvoNotifyController(
    ICreditCardService creditCardService,
    IPurchaseReceiptService purchaseReceiptService,
    EvoCipher cipher
)
At this point, you've essentially said the same thing thrice. This is, according to my theory, cognitive overhead, and a smell that the design could and should be simpler.
Now, contrast this to use of a well-known interface like IObserver<T>:
public EvoNotifyController(
    ICreditCardService creditCardService,
    IObserver<TransactionInfo> purchaseReceiptService,
    EvoCipher cipher
)
This enables you to get rid of the bureaucracy and reduce the design to the heart of the matter. You still have intention-revealing naming - you only shift the design from a Type Name Role Hint to an Argument Name Role Hint.
When it comes to the discussion about 'disconnectedness', I'm under no illusion that use of IObserver<T> will magically make this problem go away, but I have another theory about this.
My theory is that the reason many programmers find programming to interfaces so difficult is exactly because they are used to Visual Studio's Go to definition feature (incidentally, this is yet another example of how tooling rots the mind). These programmers are perpetually in a state of mind where they need to know what's 'on the other side of an interface'. Why is this? Could it be because the abstraction is poor?
This ties back to the RAP, because if you confirm programmers' belief that there's a single, particular implementation behind every interface, it's no wonder they think that interfaces are only in the way.
However, if you apply the RAP, I hope that slowly, programmers will learn that behind a particular interface, there may be any implementation of that interface, and their client code must be able to handle any implementation of that interface without changing the correctness of the system. If this theory holds, we've just introduced the Liskov Substitution Principle into a code base without scaring anyone with high-brow concepts they don't understand :)
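To make that concrete, here is roughly what the receipt sender could look like as an IObserver<T> implementation. This is only a sketch: the class name and TransactionInfo's members are guesses based on the SendReceipt signature above, not types from the actual discussion.

using System;

public class TransactionInfo
{
    public string TransactionId { get; set; }
    public string UserGuid { get; set; }
}

public class PurchaseReceiptSender : IObserver<TransactionInfo>
{
    public void OnNext(TransactionInfo value)
    {
        // send the receipt for value.TransactionId to value.UserGuid
    }

    public void OnError(Exception error)
    {
        // log or ignore, depending on policy
    }

    public void OnCompleted()
    {
    }
}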
However, because of the looseness of the coupling, looking at a class tells you nothing about the classes around it or where it fits in the larger picture.
This is not accurate. For each class you know exactly what kind of objects the class depends on to be able to provide its functionality at runtime.
You know them because you know what objects are expected to be injected.
What you don't know is the actual concrete class that will be injected at runtime to implement the interface or base class that your class(es) depend on.
So if you want to see which concrete class is actually injected, you just have to look at the configuration file for that class.
You could also use facilities provided by your IDE.
Since you refer to Eclipse: Spring has a plugin for it, which also has a visual tab displaying the beans you configure. Did you check that? Isn't it what you are looking for?
Also check out the same discussion on the Spring Forum.
UPDATE:
Reading your question again, I don't think that this is a real question.
I mean this in the following manner.
Like all things, loose coupling is not a panacea and has its own disadvantages.
Most people tend to focus on the benefits, but like any solution it has its drawbacks.
What you do in your question is describe one of its main disadvantages: it is indeed not easy to see the big picture, since everything is configurable and pluggable.
There are other drawbacks one could complain about, e.g. that it is slower than tightly coupled applications, and still be right.
In any case, to reiterate, what you describe in your question is not a problem you have stumbled upon that has a standard solution (or any, for that matter).
It is one of the drawbacks of loose coupling, and you have to decide whether this cost is higher than what you actually gain from it, as in any design trade-off.
It is like asking:
Hey, I am using this pattern named Singleton. It works great, but I can't create new objects! How can I get around this problem, guys????
Well, you can't; but if you need to, perhaps Singleton is not for you...
One thing that helped me is placing multiple closely related classes in the same file. I know this goes against the general advice (of having 1 class per file) and I generally agree with this, but in my application architecture it works very well. Below I will try to explain in which case this is.
The architecture of my business layer is designed around the concept of business commands. Command classes (simple DTO with only data and no behavior) are defined and for each command there is a 'command handler' that contains the business logic to execute this command. Each command handler implements the generic ICommandHandler<TCommand> interface, where TCommand is the actual business command.
Consumers take a dependency on the ICommandHandler<TCommand> and create new command instances and use the injected handler to execute those commands. This looks like this:
public class Consumer
{
    private ICommandHandler<CustomerMovedCommand> handler;

    public Consumer(ICommandHandler<CustomerMovedCommand> h)
    {
        this.handler = h;
    }

    public void MoveCustomer(int customerId, Address address)
    {
        var command = new CustomerMovedCommand();
        command.CustomerId = customerId;
        command.NewAddress = address;
        this.handler.Handle(command);
    }
}
Now consumers only depend on a specific ICommandHandler<TCommand> and have no notion of the actual implementation (as it should be). However, although the Consumer should know nothing about the implementation, during development I (as a developer) am very much interested in the actual business logic that is executed, simply because development is done in vertical slices; meaning that I'm often working on both the UI and business logic of a simple feature. This means I'm often switching between business logic and UI logic.
So what I did was put the command (in this example the CustomerMovedCommand) and the implementation of ICommandHandler<CustomerMovedCommand> in the same file, with the command first. Because the command itself is concrete (since it's a DTO there is no reason to abstract it), jumping to the class is easy (F12 in Visual Studio). By placing the handler next to the command, jumping to the command also means jumping to the business logic.
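Roughly, such a file could look like the sketch below. This is simplified: the supporting ICommandHandler<TCommand> interface and Address class are shown only to make the example self-contained, and the real handler obviously contains real business logic.

public interface ICommandHandler<TCommand>
{
    void Handle(TCommand command);
}

public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
}

// CustomerMovedCommand.cs: the command first...
public class CustomerMovedCommand
{
    public int CustomerId { get; set; }
    public Address NewAddress { get; set; }
}

// ...and its handler directly below it, so jumping to the command lands next to the logic.
public class CustomerMovedCommandHandler : ICommandHandler<CustomerMovedCommand>
{
    public void Handle(CustomerMovedCommand command)
    {
        // business logic for moving the customer goes here
    }
}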
Of course this only works when it is okay for the command and handler to be living in the same assembly. When your commands need to be deployed separately (for instance when reusing them in a client/server scenario), this will not work.
Of course this is just 45% of my business layer. Another big piece (say another 45%) is the queries, and they are designed similarly, using a query class and a query handler. These two classes are also placed in the same file, which -again- allows me to navigate quickly to the business logic.
Because the commands and queries are about 90% of my business layer, I can in most cases move very quickly from presentation layer to business layer and even navigate easily within the business layer.
I must say these are the only two cases where I place multiple classes in the same file, but it makes navigation a lot easier.
If you want to learn more about how I designed this, I've written two articles about this:
Meanwhile... on the command side of my architecture
Meanwhile... on the query side of my architecture
In my opinion, loosely coupled code can help you a lot, but I agree with you about its readability.
The real problem is that method names should also convey valuable information.
That is the Intention-Revealing Interface principle as stated by Domain-Driven Design (http://domaindrivendesign.org/node/113).
You could rename the get method:
// intention revealing name
Date cutoff = myExpiryCutoffDateService.calculateFromPayment();
I suggest you read thoroughly about DDD principles; your code could turn out much more readable and thus manageable.
I have found The Brain to be useful in development as a node mapping tool. If you write some scripts to parse your source into XML The Brain accepts, you could browse your system easily.
The secret sauce is to put guids in your code comments on each element you want to track, then the nodes in The Brain can be clicked to take you to that guid in your IDE.
Depending on how many developers are working on projects and whether you want to reuse some parts of it in different projects loose coupling can help you a lot. If your team is big and project needs to span several years, having loose coupling can help as work can be assigned to different groups of developers more easily. I use Spring/Java with lots of DI and Eclipse offers some graphs to display dependencies. Using F3 to open class under cursor helps a lot. As stated in previous posts, knowing shortcuts for your tool will help you.
One other thing to consider is creating custom classes or wrappers as they are more easily tracked than common classes that you already have (like Date).
If you use several modules or layers in an application, it can be a challenge to understand exactly what the project flow is, so you might need to create/use some custom tool to see how everything relates. I have created one for myself, and it helped me understand the project structure more easily.
Documentation!
Yes, you named the major drawback of loosely coupled code. And even though, as you've probably already realized, it will pay off in the end, it's true that it will always take longer to find "where" to make your modifications, and you might have to open a few files before finding "the right spot"...
But that's where something really important comes in: documentation. It's weird that no answer explicitly mentioned it; it's a MAJOR requirement in all big-sized development.
API Documentation
An APIDoc with a good search feature, where each file and --almost-- each method has a clear description.
"Big picture" documentation
I think it's good to have a wiki that explains the big picture. Bob has made a proxy system? How does it work? Does it handle authentication? What kind of component will use it? Not a whole tutorial, just a place where you can read for 5 minutes and figure out what components are involved and how they are linked together.
I do agree with all the points of Mark Seemann's answer, but when you get into a project for the first time(s), even if you understand the principles behind decoupling well, you'll either need a lot of guessing or some sort of help to figure out where to implement a specific feature you want to develop.
... Again: an APIDoc and a little developer wiki.
I am astounded that nobody has written about the testability (in terms of unit testing, of course) of loosely coupled code and the non-testability (in the same terms) of tightly coupled designs! It is a no-brainer which design you should choose. Today, with all the mock and coverage frameworks, it is obvious, well, at least for me.
Unless you do not unit test your code, or you think you do but in fact you don't...
Testing in isolation can barely be achieved with tight coupling.
You think you have to navigate through all the dependencies from your IDE? Forget about it! It is the same situation as with compilation and runtime. Hardly any bug can be found during compilation; you cannot be sure whether something works unless you test it, which means executing it. Want to know what is behind the interface? Put a breakpoint and run the goddamn application.
Amen.
...updated after the comment...
Not sure if it is going to serve you, but in Eclipse there is something called the hierarchy view. It shows you all the implementations of an interface within your project (not sure about the whole workspace). You can just navigate to the interface and press F4, and it will show you all the concrete and abstract classes implementing the interface.

Compromising design & code quality to integrate with existing modules

Greetings!
I inherited a C#.NET application that I have been extending and improving for a while now. Overall it was obviously a rush job (or whoever wrote it was seemingly less competent than myself). The app pulls some data from an embedded device and displays and manipulates it. At the core is a communications thread in the main application form which executes a 600+ line method that calls functions all over the place, implementing a state machine - lots of if-state-then-do type code. Interaction with the device is done by setting the state/mode globally and letting the thread do its thing. (This is just one example of the badness of the code - overall it is not very OO-like; it reminds me of the style of embedded C code the device firmware is written in.)
My problem is that this piece of code is central to the application. The software, communications protocol or device firmware are not documented at all. Obviously to carry on with my work I have to interact with this code.
What I would like some guidance on is whether it is worth scrapping this code and trying to piece together something more reasonable from the information I can reverse engineer. I can't decide! The reason I don't want to refactor is that the code already works, and changing it will surely be a long, laborious and unpleasant task. On the flip side, not refactoring means I sometimes have to compromise the design of other modules so that I can call my code from this state machine!
I've heard of "If it ain't broke don't fix it!", so I am wondering if it should apply when "it" is influencing the design of future code! Any advice would be appreciated!
Thanks!
Also, the longer you wait, the worse the codebase will smell. My suggestion would be to first create a test suite you can evaluate your refactoring against. That makes it a lot easier to see whether you are refactoring or just plain breaking things :).
I would definitely recommend you refactor the code if you feel it's junky. Yes, during the process of refactoring you may have some inconsistencies/problems at the start. But that is why we have iterations and testing. Since you are going to build on this core engine in the future, why not make the foundation as stable as possible?
However, be very sure about what you are going to do, because long stretches of code are not necessarily evil; on the contrary, they may be very efficient at runtime. If/else blocks are not bad if you ask me, as they branch very efficiently from a microprocessor's perspective. So you will have to use your judgment and be very clear before you touch this.
But once you refactor the code, you will definitely have finer control over it. And don't forget to document it!! Tomorrow, someone might very well say about you what you've been saying about the person who wrote that core code.
This depends on the constraints you are facing; it's a decision to be made on a practical basis, not a theoretical one. You need to consider three things.
Time: you need to have enough time to learn it, implement it, and test it, without too many other tasks interrupting you
Boss #1: if you are working for someone, they need to know about and approve up front the time and effort required to rebuild your solution
Boss #2: your boss also needs to know that the advantage of having new and clean software will come at the price of possible regressions, and therefore at the beginning of the deployment there may be unexpected bugs
If you have those three, then go ahead and refactor it. It will surely be worth it!
First and foremost, get all the business logic out of the Form. Second, locate all the parts where the code interacts with the global state (e.g. accessing the embedded system). Delegate all this access to methods. Then, move these methods into a new class and create an instance in the class's constructor. Finally, inject an instance for the class to use.
Following these steps, you can move your embedded system logic ("existing module") to a wrapper class you write, so the interface can be nice and clean and more manageable. Then you can better tackle refactoring the monster method because there is less global state to worry about (only local state).
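As a rough, hypothetical sketch of those steps (all names here are invented, since the real communications code isn't shown):

// Wraps every touch of the global state/mode behind one small interface.
public interface IDeviceLink
{
    void SetMode(int mode);
    byte[] ReadBuffer();
}

public class DeviceLink : IDeviceLink
{
    public void SetMode(int mode)
    {
        // formerly: assign the global state/mode variable directly
    }

    public byte[] ReadBuffer()
    {
        // formerly: poke at the shared buffer from inside the 600-line method
        return new byte[0];
    }
}

// The state machine now receives the dependency instead of reaching for globals.
public class CommunicationsWorker
{
    private readonly IDeviceLink device;

    public CommunicationsWorker(IDeviceLink device)
    {
        this.device = device;
    }
}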
If the code works and you can integrate your part with minimal changes to it, then leave the code as it is and do your integration.
If the code is simply a big barrier in your way to add new functionality then it is best for you to refactor it.
Talk with the other people responsible for the project, explain the situation, and give an estimate explaining the benefits gained from refactoring the code; I'm sure (I hope) the best choice will be made. It is best to speak up about what you think; don't keep anything inside, especially if it affects your productivity, motivation, etc.
NOTE: Usually rewriting code is out of the question, but depending on the situation and the amount of code that needs to be rewritten, the decision may vary.
You say that this is having an impact on the future design of the system. In this case I would say it is broken and does need fixing.
But you do have to take into account the business requirements. Often reality gets in the way!
Would it be possible to wrap this code up in another class whose interface better suits how you want to take the system forward? (See adapter pattern)
This would allow you to move forward with your requirements without the poor design having an impact.
It gives you an interface that you understand and that you can write some unit tests for. These tests can be based on what your design requires from this code, and they ensure that your assumptions about what it is doing are correct. If you say that this code works, then any failing tests may mean your assumptions are incorrect.
Once you have these tests you can safely refactor - one step at a time, and when you have some spare time or when it is needed - as per business requirements.
Quite often I find the best way to truly understand a piece of code is to refactor it.
EDIT
On reflection, as this is one big method with multiple calls to the outside world, you are going to need some kind of inverse Adapter class to wrap this method. If you can inject dependencies into the method (see Dependency Inversion) such that the method calls methods in your classes, then you can route these to the original calls.

How do you write good highly useful general purpose libraries?

I asked this question about the Microsoft .NET libraries and the complexity of their source code. From what I'm reading, writing general-purpose libraries and writing applications can be two different things. When writing libraries, you have to think about the client, who could literally be anyone (supposing I release the library for use by the general public).
What kind of practices or theories or techniques are useful when learning to write libraries? Where do you learn to write code like the one in the .NET library? This looks like a "black art" which I don't know too much about.
That's a pretty subjective question, but here's one objective answer. The Framework Design Guidelines book (be sure to get the 2nd edition) is a very good book about how to write effective class libraries. The content is very good and the often dissenting annotations are thought-provoking. Every shop should have a copy of this book available.
You definitely need to watch Josh Bloch in his presentation How to Design a Good API & Why it Matters (1h 9m long). He is a Java guru but library design and object orientation are universal.
One piece of advice often ignored by library authors is to internalize costs. If something is hard to do, the library should do it. Too often I've seen the authors of a library push something hard onto the consumers of the API rather than solving it themselves. Instead, look for the hardest things and make sure the library does them or at least makes them very easy.
I will be paraphrasing from Effective C++ by Scott Meyers, which I have found to be the best advice I got:
Adhere to the principle of least astonishment: strive to provide classes whose operators and functions have a natural syntax and an intuitive semantics. Preserve consistency with the behavior of the built-in types: when in doubt, do as the ints do.
Recognize that anything somebody can do, they will do. They'll throw exceptions, they'll assign objects to themselves, they'll use objects before giving them values, they'll give objects values and never use them, they'll give them huge values, they'll give them tiny values, they'll give them null values. In general, if it will compile, somebody will do it. As a result, make your classes easy to use correctly and hard to use incorrectly. Accept that clients will make mistakes, and design your classes so you can prevent, detect, or correct such errors.
Strive for portable code. It's not much harder to write portable programs than to write unportable ones, and only rarely will the difference in performance be significant enough to justify unportable constructs.
Even programs designed for custom hardware often end up being ported, because stock hardware generally achieves an equivalent level of performance within a few years. Writing portable code allows you to switch platforms easily, to enlarge your client base, and to brag about supporting open systems. It also makes it easier to recover if you bet wrong in the operating system sweepstakes.
Design your code so that when changes are necessary, the impact is localized. Encapsulate as much as you can; make implementation details private.
Edit: I just noticed I very nearly duplicated what cherouvim had posted; sorry about that! But turns out we're linking to different speeches by Bloch, even if the subject is exactly the same. (cherouvim linked to a December 2005 talk, I to January 2007 one.) Well, I'll leave this answer here — you're probably best off by watching both and seeing how his message and way of presenting it has evolved :)
FWIW, I'd like to point to this Google Tech Talk by Joshua Bloch, who is a greatly respected guy in the Java world, and someone who has given speeches and written extensively on API design. (Oh, and designed some exceptionally good general purpose libraries, like the Java Collections Framework!)
Joshua Bloch, Google Tech Talks, January 24, 2007:
"How To Design A Good API and Why it
Matters" (the video is about 1 hour long)
You can also read many of the same ideas in his article Bumper-Sticker API Design (but I still recommend watching the presentation!)
(Seeing you come from the .NET side, I hope you don't let his Java background get in the way too much :-) This really is not Java-specific for the most part.)
Edit: Here's another 1½ minute bit of wisdom by Josh Bloch on why writing libraries is hard, and why it's still worth putting effort in it (economies of scale) — in a response to a question wondering, basically, "how hard can it be". (Part of a presentation about the Google Collections library, which is also totally worth watching, but more Java-centric.)
Krzysztof Cwalina's blog is a good starting place. His book, Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries, is probably the definitive work for .NET library design best practices.
http://blogs.msdn.com/kcwalina/
The number one rule is to treat API design just like UI design: gather information about how your users really use your UI/API, what they find helpful and what gets in their way. Use that information to improve the design. Start with users who can put up with API churn and gradually stabilize the API as it matures.
I wrote a few notes about what I've learned about API design here: http://www.natpryce.com/articles/000732.html
I'd start looking more into design patterns. You're probably not going to find much use for some of them, but as you get deeper into your library design the patterns will become more applicable. I'd also pick up a copy of NDepend - a great code-measuring utility which may help you decouple things better. You can use the .NET libraries as an example, but personally I don't find them to be great design examples, mostly due to their complexity. Also, start looking at some open source projects to see how they're layered and structured.
A couple of separate points:
The .NET Framework isn't a class library. It's a Framework. It's a set of types meant to not only provide functionality, but to be extended by your own code. For instance, it does provide you with the Stream abstract class, and with concrete implementations like the NetworkStream class, but it also provides you the WebRequest class and the means to extend it, so that WebRequest.Create("myschema://host/more") can produce an instance of your own class deriving from WebRequest, which can have its own GetResponse method returning its own class derived from WebResponse, such that calling GetResponseStream will return your own class derived from Stream!
And your callers will not need to know this is going on behind the scenes!
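As a sketch of that extension point: MyWebRequest and MyWebRequestCreator below are invented names; only WebRequest.RegisterPrefix, IWebRequestCreate and the virtual GetResponse/RequestUri members come from the Framework.

using System;
using System.Net;

public class MyWebRequest : WebRequest
{
    private readonly Uri uri;

    public MyWebRequest(Uri uri) { this.uri = uri; }

    public override Uri RequestUri => uri;

    public override WebResponse GetResponse()
    {
        // return your own WebResponse-derived class here
        throw new NotImplementedException();
    }
}

public class MyWebRequestCreator : IWebRequestCreate
{
    public WebRequest Create(Uri uri) => new MyWebRequest(uri);
}

public static class SchemaRegistration
{
    public static void Register()
    {
        // After this call, WebRequest.Create("myschema://host/more") returns a MyWebRequest.
        WebRequest.RegisterPrefix("myschema", new MyWebRequestCreator());
    }
}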
A separate point is that for most developers, creating a reusable library is not, and should not be the goal. The goal should be to write the code necessary to meet requirements. In the process, reusable code may be found. In that case, it should be refactored out into a separate library, where it can be reused in the future.
I go further than that (when permitted). I will usually wait until I find two pieces of code that actually do the same thing, or which overlap. Presumably both pieces of code have passed all their unit tests. I will then factor out the common code into a separate class library and run all the unit tests again. Assuming that they still pass, I've begun the creation of some reusable code that works (since the unit tests still pass).
This is in contrast to a lesson I learned in school, when the result of an entire project was a beautiful reusable library - with no code to reuse it.
(Of course, I'm sure it would have worked if any code had used it...)

Anyone else find naming classes and methods one of the most difficult parts in programming? [closed]

So I'm working on this class that's supposed to request help documentation from a vendor through a web service. I try to name it DocumentRetriever, VendorDocRequester, DocGetter, but they just don't sound right. I ended up browsing through dictionary.com for half an hour trying to come up with an adequate word.
Starting programming with bad names is like having a very bad hair day in the morning: the rest of the day goes downhill from there. Feel me?
What you are doing now is fine, and I highly recommend you stick with your current syntax, being:
context + verb + how
I use this method to name functions/methods, SQL stored procs, etc. Keeping to this syntax will keep your IntelliSense/Code Panes much neater. So you want EmployeeGetByID(), EmployeeAdd(), EmployeeDeleteByID(). When you use a more grammatically correct syntax such as GetEmployee(), AddEmployee(), you'll see that it gets really messy if you have multiple Gets in the same class, as unrelated things end up grouped together.
I liken this to naming files with dates: you want to say 2009-01-07.log, not 1-7-2009.log, because after you have a bunch of them, the order becomes totally useless.
One lesson I have learned, is that if you can't find a name for a class, there is almost always something wrong with that class:
you don't need it
it does too much
A good naming convention should minimize the number of possible names you can use for any given variable, class, method, or function. If there is only one possible name, you'll never have trouble remembering it.
For functions and for singleton classes, I scrutinize the function to see if its basic function is to transform one kind of thing into another kind of thing. I'm using that term very loosely, but you'll discover that a HUGE number of functions that you write essentially take something in one form and produce something in another form.
In your case it sounds like your class transforms a Url into a Document. It's a little bit weird to think of it that way, but perfectly correct, and when you start looking for this pattern, you'll see it everywhere.
When I find this pattern, I always name the function xFromy.
Since your function transforms a Url into a Document, I would name it
DocumentFromUrl
This pattern is remarkably common. For example:
atoi -> IntFromString
GetWindowWidth -> WidthInPixelsFromHwnd // or DxFromWnd if you like Hungarian
CreateProcess -> ProcessFromCommandLine
You could also use UrlToDocument if you're more comfortable with that order. Whether you say xFromy or yTox is probably a matter of taste, but I prefer the From order because that way the beginning of the function name already tells you what type it returns.
Pick one convention and stick to it. If you are careful to use the same names as your class names in your xFromy functions, it'll be a lot easier to remember what names you used. Of course, this pattern doesn't work for everything, but it does work where you're writing code that can be thought of as "functional."
Sometimes there isn't a good name for a class or method, it happens to us all. Often times, however, the inability to come up with a name may be a hint to something wrong with your design. Does your method have too many responsibilities? Does your class encapsulate a coherent idea?
Thread 1:

function programming_job(){
    while (i make classes){
        Give each class a name quickly; always fairly long and descriptive.
        Implement and test each class to see what they really are.
        while (not satisfied){
            Re-visit each class and make small adjustments
        }
    }
}

Thread 2:

while(true){
    if (any code smells bad){
        rework, rename until at least somewhat better
    }
}
There's no Thread.sleep(...) anywhere here.
I do spend a lot of time as well worrying about the names of anything that can be given a name when I am programming. I'd say it pays off very well though. Sometimes when I am stuck I leave it for a while and during a coffee break I ask around a bit if someone has a good suggestion.
For your class I'd suggest VendorHelpDocRequester.
The book Code Complete by Steve McConnell has a nice chapter on naming variables/classes/functions/...
I think this is a side effect.
It's not the actual naming that's hard. What's hard is that the process of naming makes you face the horrible fact that you have no idea what the hell you're doing.
I actually just heard this quote yesterday, through the Signal vs. Noise blog at 37Signals, and I certainly agree with it:
"There are only two hard things in Computer Science: cache invalidation and naming things."
— Phil Karlton
It's good that it's difficult. It's forcing you to think about the problem, and what the class is actually supposed to do. Good names can help lead to good design.
Agreed. I like to keep my type names and variables as descriptive as possible without being too horrendously long, but sometimes there's just a certain concept that you can't find a good word for.
In that case, it always helps me to ask a coworker for input - even if they don't ultimately help, it usually helps me to at least explain it out loud and get my wheels turning.
I was just writing on naming conventions last month: http://caseysoftware.com/blog/useful-naming-conventions
The gist of it:
verbAdjectiveNounStructure - with Structure and Adjective as optional parts
For verbs, I stick to action verbs: save, delete, notify, update, or generate. Once in a while, I use "process" but only to specifically refer to queues or work backlogs.
For nouns, I use the class or object being interacted with. In web2project, this is often Tasks or Projects. If it's Javascript interacting with the page, it might be body or table. The point is that the code clearly describes the object it's interacting with.
The structure is optional because it's unique to the situation. A listing screen might request a List or an Array. One of the core functions used in the Project List for web2project is simply getProjectList. It doesn't modify the underlying data, just the representation of the data.
The adjectives are something else entirely. They are used as modifiers to the noun. Something as simple as getOpenProjects might be easily implemented with a getProjects and a switch parameter, but this tends to generate methods which require quite a bit of understanding of the underlying data and/or structure of the object... not necessarily something you want to encourage. By having more explicit and specific functions, you can completely wrap and hide the implementation from the code using it. Isn't that one of the points of OO?
More so than just naming a class, creating an appropriate package structure can be a difficult but rewarding challenge. You need to consider separating the concerns of your modules and how they relate to the vision of the application.
Consider the layout of your app now:
App
  VendorDocRequester (read from web service and provide data)
  VendorDocViewer (use requester to provide vendor docs)
I would venture to guess that there's a lot going on inside a few classes. If you were to refactor this into a more MVC-ified approach, and allow small classes to handle individual duties, you might end up with something like:
App
  VendorDocs
    Model
      Document (plain object that holds data)
      WebServiceConsumer (deal with nitty gritty in web service)
    Controller
      DatabaseAdapter (handle persistence using ORM or other method)
      WebServiceAdapter (utilize Consumer to grab a Document and stick it in database)
    View
      HelpViewer (use DBAdapter to spit out the documentation)
Then your class names rely on the namespace to provide full context. The classes themselves can be inherently related to application without needing to explicitly say so. Class names are simpler and easier to define as a result!
One other very important suggestion: please do yourself a favor and pick up a copy of Head First Design Patterns. It's a fantastic, easy-reading book that will help you organize your application and write better code. Appreciating design patterns will help you to understanding that many of the problems you encounter have already been solved, and you'll be able to incorporate the solutions into your code.
Leo Brodie, in his book "Thinking Forth", wrote that the most difficult task for a programmer was naming things well, and he stated that the most important programming tool is a thesaurus.
Try using the thesaurus at http://thesaurus.reference.com/.
Beyond that, don't use Hungarian Notation EVER, avoid abbreviations, and be consistent.
Best wishes.
In short:
I agree that good names are important, but I don't think you have to find them before implementing at all costs.
Of course its better to have a good name right from the start. But if you can't come up with one in 2 minutes, renaming later will cost less time and is the right choice from a productivity point of view.
Long:
Generally it's often not worth thinking too long about a name before implementing. If you implement your class, naming it "Foo" or "Dsnfdkgx", then while implementing you'll see what you should have named it.
Especially with Java+Eclipse, renaming things is no pain at all, as it carefully handles all references in all classes, warns you of name collisions, etc. And as long as the class is not yet in the version control repository, I don't think there's anything wrong with renaming it 5 times.
Basically, it's a question of how you think about refactoring. Personally, I like it, though it annoys my team mates sometimes, as they believe in "never touch a running system". And of everything you can refactor, changing names is one of the most harmless things you can do.
Why not HelpDocumentServiceClient (kind of a mouthful), or HelpDocumentClient... it doesn't matter that it's a vendor; the point is that it's a client to a web service that deals with help documents.
And yes, naming is hard.
There is only one sensible name for that class:
HelpRequest
Don't let the implementation details distract you from the meaning.
Invest in a good refactoring tool!
I stick to basics: VerbNoun(arguments). Examples: GetDoc(docID).
There's no need to get fancy. It will be easy to understand a year from now, whether it's you or someone else.
For me, I don't care how long a method or class name is as long as it's descriptive and in the correct library. Long gone are the days when you had to remember where each part of the API resides.
IntelliSense exists for all major languages. Therefore, when using a 3rd-party API, I like to use its IntelliSense as the documentation, as opposed to using the 'actual' documentation.
With that in mind I am fine to create a method name such as
StevesPostOnMethodNamesBeingLongOrShort
Long - but so what? Who doesn't use 24-inch screens these days!
I have to agree that naming is an art. It gets a little easier if your class follows a certain "design pattern" (factory etc.).
This is one of the reasons to have a coding standard. Having a standard tends to assist coming up with names when required. It helps free up your mind to use for other more interesting things! (-:
I'd recommend reading the relevant chapter of Steve McConnell's Code Complete (Amazon link) which goes into several rules to assist readability and even maintainability.
HTH
cheers,
Rob
Nope, debugging is the most difficult thing for me! :-)
DocumentFetcher? It's hard to say without context.
It can help to act like a mathematician and borrow/invent a lexicon for your domain as you go: settle on short plain words that suggest the concept without spelling it out every time. Too often I see long latinate phrases that get turned into acronyms, making you need a dictionary for the acronyms anyway.
The language you use to describe the problem, is the language you should use for the variables, methods, objects, classes, etc. Loosely, nouns match objects and verbs match methods. If you're missing words to describe the problem, you're also missing a full understanding (specification) of the problem.
If it's just choosing between a set of names, then it should be driven by the conventions you are using to build the system. If you've come to a new spot, uncovered by previous conventions, then it's always worth spending some effort on trying extend them (properly, consistently) to cover this new case.
If in doubt, sleep on it, and pick the first most obvious name, the next morning :-)
If you wake up one day and realize you were wrong, then change it right away.
Paul.
BTW: Document.fetch() is pretty obvious.
I find I have the most trouble in local variables. For example, I want to create an object of type DocGetter. So I know it's a DocGetter. Why do I need to give it another name? I usually end up giving it a name like dg (for DocGetter) or temp or something equally nondescriptive.
Don't forget that design patterns (not just the GoF ones) are a good way of providing a common vocabulary, and their names should be used whenever one fits the situation. That will even help newcomers who are familiar with the nomenclature to quickly understand the architecture. Is this class you're working on supposed to act like a Proxy, or even a Façade?
Shouldn't the vendor documentation be the object? I mean, that one is tangible, and not just as some anthropomorphization of a part of your program. So, you might have a VendorDocumentation class with a constructor that fetches the information. I think that if a class name contains a verb, often something has gone wrong.
I definitely feel you. And I feel your pain. Every name I think of just seems rubbish to me. It all seems so generic and I want to eventually learn how to inject a bit of flair and creativity into my names, making them really reflect what they describe.
One suggestion I have is to consult a Thesaurus. Word has a good one, as does Mac OS X. That can really help me get my head out of the clouds and gives me a good starting place as well as some inspiration.
If the name would explain itself to a lay programmer then there's probably no need to change it.

What OOP coding practices should you always make time for?

I tend to do a lot of projects on short deadlines and with lots of code that will never be used again, so there's always pressure/temptation to cut corners. One rule I always stick to is encapsulation/loose coupling, so I have lots of small classes rather than one giant God class. But what else should I never compromise on?
Update - thanks for the great response. Lots of people have suggested unit testing, but I don't think that's really appropriate to the kind of UI coding I do. Usability / user acceptance testing seems much more important. To reiterate, I'm talking about the BARE MINIMUM of coding standards for impossible-deadline projects.
Not OOP, but a practice that helps in both the short and long run is DRY, Don't Repeat Yourself. Don't use copy/paste inheritance.
Not an OOP practice, but common sense ;-).
If you are in a hurry and have to write a hack, always add a comment with the reasons, so you can trace it back and make a good solution later.
If you never have the time to come back, you always have the comment so you know why that solution was chosen at the time.
Use Source control.
No matter how long it takes to set up (seconds..), it will always make your life easier! (still it's not OOP related).
Naming. Under pressure you'll write horrible code that you won't have time to document or even comment. Naming variables, methods and classes as explicitly as possible takes almost no additional time and will make the mess readable when you must fix it. From an OOP point of view, using nouns for classes and verbs for methods naturally helps encapsulation and modularity.
Unit tests - helps you sleep at night :-)
This is rather obvious (I hope), but at the very least I always make sure my public interface is as correct as possible. The internals of a class can always be refactored later on.
No public class with mutable public variables (struct-like).
Before you know it, you refer to this public variable all over your code, and the day you decide this field is a computed one and must have some logic in it... the refactoring gets messy.
If that day is before your release date, it gets messier.
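A tiny, made-up illustration of the difference:

public class Invoice
{
    // Risky: every caller can read and write this directly, so turning it into a
    // computed value later means touching all of them.
    public decimal Total;
}

public class BetterInvoice
{
    // Callers read it the same way, but the logic behind it can change here,
    // in one place, without breaking them.
    public decimal Total { get; private set; }
}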
Think about the people (may even be your future self) who have to read and understand the code at some point.
Application of the single responsibility principle. Effectively applying this principle generates a lot of positive externalities.
Like everyone else has said, these are not so much OOP practices as general coding practices that apply to OOP.
Unit test, unit test, unit test. Defined unit tests have a habit of keeping people on task and not "wandering" aimlessly between objects.
Define and document all hierarchical information (namespaces, packages, folder structures, etc.) prior to writing production code. This helps to flesh out object relations and expose flaws in assumptions related to relationships of objects.
Define and document all applicable interfaces prior to writing production code. If done by a lead or an architect, this practice can additionally help keep more junior-level developers on task.
There are probably countless other "shoulds", but if I had to pick my top three, that would be the list.
Edit in response to comment:
This is precisely why you need to do these things up front. All of these sorts of practices make continued maintenance easier. As you assume more risk in the kickoff of a project, the more likely it is that you will spend more and more time maintaining the code. Granted, there is a larger upfront cost, but building on a solid foundation pays for itself. Is your obstacle lack of time (i.e. having to maintain other applications) or a decision from higher up? I have had to fight both of those fronts to be able to adopt these kinds of practices, and it isn't a pleasant situation to be in.
Of course everything should be Unit tested, well designed, commented, checked into source control and free of bugs. But life is not like that.
My personal ranking is this:
Use source control and actually write commit comments. This way you have a tiny bit of documentation should you ever wonder "what the heck did I think when I wrote this?"
Write clean code or document. Clean, well-written code should need little documentation, as its meaning can be grasped from reading it. Hacks are a lot different: write down why you did it, what it does, and what you'd like to do if you had the time/knowledge/motivation/... to do it right.
Unit test. Yes, it's down at number three. Not because it's unimportant, but because it's useless if you don't have the other two at least halfway complete. Writing unit tests is another level of documentation of what your code should be doing (among other things).
Refactor before you add something. This might sound like a typical "but we don't have time for it" point. But as with many of those points it usually saves more time than it costs. At least if you have at least some experience with it.
I'm aware that much of this has already been mentioned, but since it's a rather subjective matter, I wanted to add my ranking.
[insert boilerplate not-OOP specific caveat here]
Separation of concerns, unit tests, and that feeling that if something is too complex it's probably not conceptualised quite right yet.
UML sketching: this has clarified and saved any amount of wasted effort so many times. Pictures are great aren't they? :)
Really thinking about is-a's and has-a's. Getting this right first time is so important.
No matter how fast a company wants it, I pretty much always try to write code to the best of my ability.
I don't find it takes any longer and usually saves a lot of time, even in the short-term.
I can't remember ever writing code and never looking at it again; I always make a few passes over it to test and debug it, and even in those few passes, practices like refactoring to keep my code DRY, documentation (to some degree), separation of concerns and cohesion all seem to save time.
This includes creating many more small classes than most people (one concern per class, please) and often extracting initialization data into external files (or arrays) and writing little parsers for that data... sometimes even writing little GUIs instead of editing data by hand.
Coding itself is pretty quick and easy; debugging crap someone wrote when they were "under pressure" is what takes all the time!
At almost a year into my current project I finally set up an automated build that pushes any new commits to the test server, and man, I wish I had done that on day one. The biggest mistake I made early-on was going dark. With every feature, enhancement, bug-fix etc, I had a bad case of the "just one mores" before I would let anyone see the product, and it literally spiraled into a six month cycle. If every reasonable change had been automatically pushed out it would have been harder for me to hide, and I would have been more on-track with regard to the stakeholders' involvement.
Go back to code you wrote a few days/weeks ago and spend 20 minutes reviewing your own code. With the passage of time, you will be able to determine whether your "off-the-cuff" code is organized well enough for future maintenance efforts. While you're in there, look for refactoring and renaming opportunities.
I sometimes find that the name I chose for a function at the outset doesn't perfectly fit the function in its final form. With refactoring tools, you can easily change the name early before it goes into widespread use.
Just like everybody else has suggested these recommendations aren't specific to OOP:
Ensure that you comment your code and use sensibly named variables. If you ever have to look back upon the quick and dirty code you've written, you should be able to understand it easily. A general rule that I follow is; if you deleted all of the code and only had the comments left, you should still be able to understand the program flow.
Hacks usually tend to be convoluted and un-intuitive, so some good commenting is essential.
I'd also recommend that if you usually have to work to tight deadlines, get yourself a code library built up based upon your most common tasks. This will allow you to "join the dots" rather than reinvent the wheel each time you have a project.
Regards,
Docta
An actual OOP practice I always make time for is the Single Responsibility Principle, because it becomes so much harder to properly refactor the code later on when the project is "live".
By sticking to this principle I find that the code I write is easily re-used, replaced or rewritten if it fails to match the functional or non-functional requirements. When you end up with classes that have multiple responsibilities, some of them may fulfill the requirements, some may not, and the whole may be entirely unclear.
These kinds of classes are stressful to maintain because you are never sure what your "fix" will break.
For this special case (short deadlines and lots of code that will never be used again), I suggest you consider embedding a script engine into your OOP code.
Learn to "refactor as-you-go". Mainly from an "extract method" standpoint. When you start to write a block of sequential code, take a few seconds to decide if this block could stand-alone as a reusable method and, if so, make that method immediately. I recommend it even for throw-away projects (especially if you can go back later and compile such methods into your personal toolbox API). It doesn't take long before you do it almost without thinking.
Hopefully you do this already and I'm preaching to the choir.