After reading this wiki on the Stable Abstractions Principle (SAP), I was wondering if anyone knows any disadvantage to depending on abstractions rather than concretes (I suppose, that outweighs the advantages).
The SAP states that the more stable a package, the more abstract it should be. This implies that if a package is less stable (more likely to change) then it should be more concrete. What I don't really understand is why this should be the case. Surely in all cases regardless of stability we should be depending upon abstractions and hiding the concrete implementation?
Robert C. Martin always had a rather obscure way of describing things. His points are always very good but require a bit of deciphering -- "afferent vs. efferent coupling", ugh! Another thing about the way Martin writes is that it's always kind of blurred between descriptive and prescriptive ("would" or "should"?)
"Stability"
First it's important to understand how Martin defines "stability". He defines it in terms of afferent and efferent couplings yielding a stability metric:
instability = efferent / (efferent + afferent)
"Afferent" and "efferent" are such obscure terms. For simplicity, let's use "outgoing dependencies" in place of "efferent couplings" and "incoming dependencies" for "afferent couplings". So we have this:
instability = outgoing / (outgoing + incoming)
This metric is very much divorced from the likelihood of change, and has everything to do with the difficulty of change. Confusing as it is, by this definition a "stable" package could still be changing all the time (though that would be bad and really difficult to manage, of course).
If you get a divide by zero error with the above formula, then your package is neither being used nor using anything.
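As a minimal sketch of the metric above (the function name, the package names, and the dependency counts are all made up for illustration), the formula and its divide-by-zero case could be written like this:

```python
from typing import Optional


def instability(outgoing: int, incoming: int) -> Optional[float]:
    """Return outgoing / (outgoing + incoming).

    Returns None for a package that neither uses nor is used by anything,
    which is the divide-by-zero case of the formula.
    """
    if outgoing < 0 or incoming < 0:
        raise ValueError("dependency counts cannot be negative")
    total = outgoing + incoming
    if total == 0:
        return None
    return outgoing / total


# Hypothetical packages: name -> (outgoing dependencies, incoming dependencies)
packages = {"sdk_core": (0, 40), "app_main": (12, 0), "utils": (3, 9)}
scores = {name: instability(out, inc) for name, (out, inc) in packages.items()}
```

Here sdk_core scores 0.0 (maximally stable: everything depends on it, it depends on nothing) and app_main scores 1.0 (maximally unstable).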
Stable Dependencies Principle
To understand Martin's point about SAP in context, it's easier to start with SDP (Stable Dependencies Principle). It states:
The dependencies between packages should be in the direction of the
stability of the packages. A package should only depend upon packages
that are more stable than it is.
That's pretty easy to understand. The cost of changing a design cascades with the number (and complexity) of incoming dependencies to it. Probably anyone who has worked in a large-scale codebase can appreciate this one pretty quickly where a central design change might end up wanting to break 10,000 really complex parts in the codebase.
So the dependencies should (would?) flow towards the parts that are unchanging, firmly-rooted, unwavering, like a tree flowing down from its leaves towards its roots.
The stability metrics that the roots should have boil down to zero efferent couplings (zero outgoing dependencies). That is, this stable "root" package should not depend on anything else. In other words, it should be totally independent of the outside world. This is the characteristic that defines "maximum stability" according to Martin's metrics: total independence.
Maximum independence = "stable root" (as I'm calling it)
Maximum dependence = "unstable leaf" (as I'm calling it)
Given this kind of totally-independent, ultra stable root design, how can we still gain back a level of flexibility where we can easily extend and change its implementation without affecting the interface/design? And that's where abstractions come in.
Stable Abstractions Principle
Abstractions allow us to decouple implementation from interface/design.
And thus, here comes the stable abstractions principle:
Packages that are maximally stable should be maximally abstract.
Unstable packages should be concrete. The abstractness of a package
should be in proportion to its stability.
The idea is to allow these central root designs to be ultra-stable, as stated by SDP, while still retaining a degree of flexibility for changes which do not impact the core design through abstraction.
As a simple example, consider a software development kit at the heart of some engine and used by plugin developers worldwide. By definition, this SDK would have to have a very stable design given the combination of numerous incoming dependencies (all these plugin developers using it) against minimal or no outgoing dependencies (the SDK depends on little else). This principle would suggest that its interfaces should be abstract to have the maximum degree of flexibility for change without impacting the stable design.
"Moderately abstract" here might be an abstract base class. "Maximally abstract" would be a pure interface.
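To make the two levels concrete, here is a rough sketch in Python using the standard abc module. The class names (Shape, NamedShape, Square) are hypothetical and not taken from Martin; other languages would express the same distinction with interface and abstract class keywords.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """'Maximally abstract': a pure interface, every method is abstract."""

    @abstractmethod
    def area(self) -> float:
        ...


class NamedShape(Shape):
    """'Moderately abstract': shares some implementation, but area() is
    still abstract, so this class cannot be instantiated either."""

    def __init__(self, name: str) -> None:
        self.name = name

    def describe(self) -> str:
        return f"{self.name} with area {self.area():.2f}"


class Square(NamedShape):
    """Concrete: supplies the last missing piece."""

    def __init__(self, side: float) -> None:
        super().__init__("square")
        self.side = side

    def area(self) -> float:
        return self.side * self.side
```

Code that depends only on Shape is unaffected by changes to NamedShape or Square, which is the flexibility the principle is after.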
Concrete
On the flip side, the abstract needs the concrete. Otherwise there would be nothing to provide the implementation for an abstraction. So this principle also suggests that the concrete parts should (would?) be the unstable parts. If you imagine this as a tree (inverted from the usual programming tree) with dependencies flowing downwards from leaf to root, the leaves should be the most concrete, the roots should be the most abstract.
The leaves would typically have the most outgoing dependencies (lots of dependencies to things outside -- to all those branches and roots), while they would have zero incoming dependencies (nothing would depend on them). The roots would be opposite (everything depends on them, they depend on nothing).
This is how I've come to understand Martin's descriptions. They are difficult to understand and I may be off on some parts.
Surely in all cases regardless of stability we should be depending
upon abstractions and hiding the concrete implementation?
Perhaps you're thinking more in terms of entities. An abstract interface for an entity would still require a concrete implementation somewhere. The concrete part may be unstable, and would likewise be easier to change since nothing else depends on it directly (no afferent couplings). The abstract part should be stable as many could potentially depend on it (lots of incoming dependencies, few or no outgoing dependencies), and so it would be difficult to change.
At the same time, if you work your way up to a more dependent package, like the application package containing the main entry point where everything is assembled together, making all the interfaces abstract there would often increase the difficulty of change, and would still transfer the need for a concrete (unstable) implementation somewhere else. At some point in a codebase, there have to be dependencies on concrete parts, if only to select the appropriate concrete implementation for an abstract interface.
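A small sketch of that last point, with hypothetical names throughout (Storage, InMemoryStorage, Greeter): everything depends on the abstraction except the entry point, which has to name a concrete class in order to pick one.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Stable, abstract: many parts of the program may depend on this."""

    @abstractmethod
    def save(self, key: str, value: str) -> None:
        ...

    @abstractmethod
    def load(self, key: str) -> str:
        ...


class InMemoryStorage(Storage):
    """Unstable, concrete: nothing but main() refers to it directly."""

    def __init__(self) -> None:
        self._data = {}

    def save(self, key: str, value: str) -> None:
        self._data[key] = value

    def load(self, key: str) -> str:
        return self._data[key]


class Greeter:
    """Depends only on the abstract Storage."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def remember(self, name: str) -> None:
        self._storage.save("name", name)

    def greet(self) -> str:
        return f"Hello, {self._storage.load('name')}!"


def main() -> str:
    # The entry point is the one place that selects a concrete implementation.
    greeter = Greeter(InMemoryStorage())
    greeter.remember("Ada")
    return greeter.greet()
```

Swapping InMemoryStorage for another implementation only touches main(), which is easy to change precisely because nothing depends on it.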
To Abstract or Not to Abstract
I was wondering if anyone knows any disadvantage to depending on
abstractions rather than concretes (I suppose, that outweighs the
advantages).
Performance comes to mind. Abstractions typically carry some kind of runtime cost, for example in the form of dynamic dispatch, which is in turn susceptible to branch mispredictions. A lot of Martin's writing revolves around classical object-oriented paradigms. Moreover, OOP in general wants to model things at the level of the singular entity. At the extreme, it might want to make a single pixel of an image into an abstract interface with its own operations.
In my field, I tend to use entity-component systems with a data-oriented design mindset. This kind of flips the classical OOP world upside down. Structures are often designed to aggregate data for multiple entities at once with a design mindset looking for optimal memory layout (designing for the machine rather than logically for the human). Entities are designed as collections of components, and components are modeled as raw data using a data-oriented mindset. Interfaces still get abstract for systems that process components, but the abstractions are designed to process things in bulk, and the dependencies flow from systems to central components which are not abstract in the slightest.
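As a rough illustration of "aggregating data for multiple entities at once", here is a sketch with hypothetical names (MotionComponents, integrate). Python lists will not give the memory-layout benefits a C or C++ array would; the point is only the shape of the design: plain concrete data, processed in bulk by a system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MotionComponents:
    """Motion data for many entities, stored as parallel arrays
    (one slot per entity) instead of one object per entity."""

    x: List[float] = field(default_factory=list)
    vx: List[float] = field(default_factory=list)

    def add(self, x: float, vx: float) -> int:
        """Append one entity's motion data and return its index."""
        self.x.append(x)
        self.vx.append(vx)
        return len(self.x) - 1


def integrate(motion: MotionComponents, dt: float) -> None:
    """A 'system': advances every entity's position in one pass."""
    for i in range(len(motion.x)):
        motion.x[i] += motion.vx[i] * dt
```

Note that integrate depends directly on the concrete MotionComponents layout; there is no per-entity interface or virtual call in the loop.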
This is a very common technique employed in game engines and it offers a lot of potential in terms of performance and flexibility. Yet it is in stark contrast to the kind of focus Martin places on object-oriented programming, as it is a strong departure from OOP overall.
First of all, from the paper you link to:
stability is not a measure of the likelihood that a module will
change; rather it is a measure of the difficulty in changing a module
So things that are hard to change (e.g. used in many places) should be abstract, to make extension easy/possible.
And yes, there are disadvantages: ease of change. It's much easier and faster to change concrete code alone than to change an abstraction and the code behind it.
Surely in all cases regardless of stability we should be depending
upon abstractions and hiding the concrete implementation?
That is true, but the level of abstraction differs. An on-the-fly example: if I ask you to compute the length of a square's diagonal, you will probably just use the built-in double sqrt(double) function. Is it abstracted? Yes. We don't know whether Newton's method is used or whether it is delegated directly to the CPU.
But what if we want to create a sqrt function and build some kind of physics calculation library on it? Is the previous abstraction enough in this case? Probably not, as we may want to handle (in a uniform way) matrices, relative errors, arbitrary-length numbers, parallelization across a desired number of cores/threads, and maybe delegation to the GPU, and it should be prepared for other extensions, because sooner or later someone may want it to handle NaNs and imaginary numbers.
So it's still a sqrt function, but the level of abstraction is a bit higher, and that's only because lots of code will depend on it. And which function is easier to change?
This implies that if a package is less stable (more likely to change)
then it should be more concrete. What I don't really understand is why
this should be the case.
Abstractions are things that are hard to change in software because everything depends on them. If your package is going to change often and it provides abstractions, people who depend on it will be forced to rewrite a big chunk of their code when you change something. But if your unstable package provides some concrete implementations, much less code will have to be rewritten after changes.
So, if your package is going to change often, it had better provide concretes, not abstractions. Otherwise... who the hell will use it? ;)
In relation to crosscutting concerns and aspect-oriented programming, you often read about code tangling. This article [1] describes code tangling as:
Modules in a software system may simultaneously interact with several requirements. For example, oftentimes developers simultaneously think about business logic, performance, synchronization, logging, and security. Such a multitude of requirements results in the simultaneous presence of elements from each concern's implementation, resulting in code tangling.
Isn't that exactly the same as low cohesion? Is there any difference between high tangling and low cohesion, or are they two different terms describing the same thing?
According to Wikipedia:
The implementation of a concern is tangled if its code is intermixed
with code that implements other concerns. The module in which tangling
occurs is not cohesive.
Cohesion is decreased if:
- The functionalities embedded in a class, accessed through its methods, have little in common.
- Methods carry out many varied activities, often using coarsely-grained or unrelated sets of data.
So when the code is tangled, it would violate SOLID principles such as the Single Responsibility Principle, the Open/Closed Principle, etc.
All these principles most often go together, and violation of one principle/best practice leads to violation of another.
But tangling doesn't necessarily mean that the code is not cohesive.
For example, we could have a class called SecurityChecker, which authenticates a user and logs all authentication-related activities.
Clearly this would be handling multiple concerns, namely Authentication and Logging. Therefore it would be a tangled class.
On the other hand, both these concerns would be operating on the same set of data, which in this case could be user data, times of logon, number of login attempts, etc. Therefore cohesion could still be high.
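A toy sketch of such a class might look like this. Only the name SecurityChecker comes from the text above; the fields, the method names, and the plain-text password comparison are assumptions made for brevity (real code would compare password hashes).

```python
from datetime import datetime, timezone


class SecurityChecker:
    """Tangled: authentication and logging are intermixed in one class.
    Arguably still cohesive: both concerns work on the same login data."""

    def __init__(self, passwords: dict) -> None:
        self._passwords = passwords   # user -> password (toy example only)
        self.failed_attempts = {}     # user -> number of failed logins
        self.audit_log = []           # one entry per authentication attempt

    def authenticate(self, user: str, password: str) -> bool:
        # Concern 1: authentication
        ok = user in self._passwords and self._passwords[user] == password
        if not ok:
            self.failed_attempts[user] = self.failed_attempts.get(user, 0) + 1
        # Concern 2: logging, woven into the same method
        self._log(user, ok)
        return ok

    def _log(self, user: str, ok: bool) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        failures = self.failed_attempts.get(user, 0)
        self.audit_log.append(
            f"{stamp} user={user} success={ok} failures={failures}"
        )
```

Both methods read and write the same login state, which is why the class can be called cohesive even though two concerns are tangled inside it.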
Generally, most of these principles/guidelines/best practices look at the same issue from different perspectives, and the end goal is to manage dependencies between different components/classes etc. so that the overall design is more maintainable, efficient and elegant in the long run.
Very similar yes.
Cohesion is used to indicate the degree to which a class has a single, well-focused purpose.
Therefore, if you have a class with a single well-focused purpose, then it would follow that it's not "tangled" by trying to do more than one thing.
From what I understand, OOP is the most commonly used paradigm for large-scale projects. I also know that some smaller subsets of big systems use other paradigms (e.g. SQL, which is declarative), and I also realize that at lower levels of computing OOP isn't really feasible. But it seems to me that usually the pieces of higher level solutions are almost always put together in an OOP fashion.
Are there any scenarios where a truly non-OOP paradigm is actually a better choice for a large-scale solution? Or is that unheard of these days?
I've wondered this ever since I've started studying CS; it's easy to get the feeling that OOP is some nirvana of programming that will never be surpassed.
In my opinion, the reason OOP is used so widely isn't so much that it's the right tool for the job. I think it's more that a solution can be described to the customer in a way that they understand.
A CAR is a VEHICLE that has an ENGINE. That's programming and real world all in one!
It's hard to comprehend anything that can fit the programming and real world quite so elegantly.
Linux is a large-scale project that's very much not OOP. And it wouldn't have a lot to gain from it either.
I think OOP has a good ring to it, because it has associated itself with good programming practices like encapsulation, data hiding, code reuse, modularity, etc. But these virtues are by no means unique to OOP.
You might have a look at Erlang, co-created by Joe Armstrong.
Wikipedia:
"Erlang is a general-purpose
concurrent programming language and
runtime system. The sequential subset
of Erlang is a functional language,
with strict evaluation, single
assignment, and dynamic typing."
Joe Armstrong:
“Because the problem with
object-oriented languages is they’ve
got all this implicit environment that
they carry around with them. You
wanted a banana but what you got was a
gorilla holding the banana and the
entire jungle.”
The promise of OOP was code reuse and easier maintenance. I am not sure it delivered. I see things such as .NET as being much the same as the C libraries we used to get from various vendors. You can call that code reuse if you want. As for maintenance, bad code is bad code. OOP did not help.
I'm the biggest fan of OOP, and I practice OOP every day.
It's the most natural way to write code, because it resembles the real life.
Though, I realize that the OOP's virtualization might cause performance issues.
Of course that depends on your design, the language and the platform you choose (systems written in garbage-collected languages such as Java or C# might perform worse than systems written in C++, for example).
I guess in real-time systems, procedural programming may be more appropriate.
Note that not all projects that claim to be OOP are in fact OOP. Sometimes the majority of the code is procedural, or the data model is anemic, and so on...
Zyx, you wrote, "Most of the systems use relational databases ..."
I'm afraid there's no such thing. The relational model will be 40 years old next year and has still never been implemented. I think you mean, "SQL databases." You should read anything by Fabian Pascal to understand the difference between a relational dbms and an SQL dbms.
" ... the relational model is usually chosen due to its popularity,"
True, it's popular.
" ... availability of tools,"
Alas without the main tool necessary: an implementation of the relational model.
" support,"
Yup, the relational model has fine support, I'm sure, but it's entirely unsupported by a dbms implementation.
" and the fact that the relational model is in fact a mathematical concept,"
Yes, it's a mathematical concept, but, not being implemented, it's largely restricted to the ivory towers. String theory is also a mathematical concept but I wouldn't implement a system with it.
In fact, despite its being a mathematical concept, it is certainly not a science (as in computer science) because it lacks the first requirement of any science, that it be falsifiable: there's no implementation of a relational dbms against which we can check its claims.
It's pure snake oil.
" ... contrary to OOP."
And contrary to OOP, the relational model has never been implemented.
Buy a book on SQL and get productive.
Leave the relational model to unproductive theorists.
See this and this. Apparently you can use C# with five different programming paradigms, C++ with three, etc.
Software construction is not akin to fundamental physics. Physics strives to describe reality using paradigms which may be challenged by new experimental data and/or theories. Physics is a science which searches for a "truth", in a way that software construction doesn't.
Software construction is a business. You need to be productive, i.e. to achieve some goals for which someone will pay money. Paradigms are used because they are useful to produce software effectively. You don't need everyone to agree. If I do OOP and it's working well for me, I don't care if a "new" paradigm would potentially be 20% more useful to me if I had the time and money to learn it and later rethink the whole software structure I'm working in and redesign it from scratch.
Also, you may be using another paradigm and I'll still be happy, in the same way that I can make money running a Japanese food restaurant and you can make money with a Mexican food restaurant next door. I don't need to discuss with you whether Japanese food is better than Mexican food.
I doubt OOP is going away any time soon, it just fits our problems and mental models far too well.
What we're starting to see, though, is multi-paradigm approaches, with declarative and functional ideas being incorporated into object-oriented designs. Most of the newer JVM languages are a good example of this (JavaFX, Scala, Clojure, etc.), as well as LINQ and F# on the .NET platform.
It's important to note that I'm not talking about replacing OO here, but about complementing it.
JavaFX has shown that a declarative
solution goes beyond SQL and XSLT,
and can also be used for binding
properties and events between visual
components in a GUI
For fault tolerant and highly
concurrent systems, functional
programming is a very good fit,
as demonstrated by the Ericsson
AXD301 (programmed using Erlang)
So... as concurrency becomes more important and FP becomes more popular, I imagine that languages not supporting this paradigm will suffer. This includes many that are currently popular such as C++, Java and Ruby, though JavaScript should cope very nicely.
Using OOP makes the code easier to manage (as in modify/update/add new features) and understand. This is especially true with bigger projects. Because modules/objects encapsulate their data and operations on that data it is easier to comprehend the functionality and the big picture.
The benefit of OOP is that it is easier to discuss (with other developers/management/customers) a LogManager or OrderManager, each of which encompasses specific functionality, than to describe 'a group of methods that dump the data in a file' and 'the methods that keep track of order details'.
So I guess OOP is helpful especially with big projects, but there are always new concepts turning up, so keep a lookout for new stuff in the future, evaluate it, and keep what is useful.
People like to think of various things as "objects" and classify them, so it's no wonder that OOP is so popular. However, there are some areas where OOP has not gained much popularity. Most of the systems use relational databases rather than object databases. Even though the latter hold some notable records and are better for some types of tasks, the relational model is usually chosen due to its popularity, availability of tools, support and the fact that the relational model is in fact a mathematical concept, contrary to OOP.
Another area where I have never seen OOP is the software build process. All the configuration and make scripts are procedural, partially because of the lack of support for OOP in shell languages, partially because OOP is too complex for such tasks.
Slightly controversial opinion from me but I don't find OOP, at least of a kind that is popularly applied now, to be that helpful in producing the largest scale software in my particular domain (VFX, which is somewhat similar in scene organization and application state as games). I find it very useful on a medium to smaller scale. I have to be a bit careful here since I've invited some mobs in the past, but I should qualify that this is in my narrow experience in my particular type of domain.
The difficulty I've often found is that if you have all these small concrete objects encapsulating data, they now want to all talk to each other. The interactions between them can get extremely complex, like so (except much, much more complex in a real application spanning thousands of objects):
And this is not a dependency graph directly related to coupling so much as an "interaction graph". There could be abstractions to decouple these concrete objects from each other. Foo might not talk to Bar directly. It might instead talk to it through IBar or something of this sort. This graph would still connect Foo to Bar since, albeit being decoupled, they still talk to each other.
And all this communication between small and medium-sized objects which make up their own little ecosystem, if applied to the entire scale of a large codebase in my domain, can become extremely difficult to maintain. And it becomes so difficult to maintain because it's hard to reason about what happens with all these interactions between objects with respect to things like side effects.
Instead, what I've found useful is to organize the overall codebase into completely independent, hefty subsystems that access a central "database". Each subsystem then inputs and outputs data. Some other subsystems might access the same data, but without any system talking directly to another.
... or this:
... and each individual system no longer attempts to encapsulate state. It doesn't try to become its own ecosystem. It instead reads and writes data in the central database.
Of course, the implementation of each subsystem might use a number of objects. And that's where I find OOP very useful: in the implementation of these subsystems. But each of these subsystems constitutes a relatively medium to small-scale project, not too large, and it's at that medium to smaller scale that I find OOP very useful.
"Assembly-Line Programming" With Minimum Knowledge
This allows each subsystem to just focus on doing its thing with almost no knowledge of what's going on in the outside world. A developer focusing on physics can just sit down with the physics subsystem and know little about how the software works except that there's a central database from which he can retrieve things like motion components (just data) and transform them by applying physics to that data. And that makes his job very simple and makes it so he can do what he does best with the minimum knowledge of how everything else works. Input central data and output central data: that's all each subsystem has to do correctly for everything else to work. It's the closest thing I've found in my field to "assembly line programming" where each developer can do his thing with minimum knowledge about how the overall system works.
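A bare-bones sketch of this arrangement could look like the following. The names (Database, physics_system, render_system) and the dictionary-based storage are assumptions for illustration only; a production ECS would store components far more efficiently.

```python
class Database:
    """Central store: component kind -> {entity id -> component data}."""

    def __init__(self) -> None:
        self._tables = {}

    def set(self, kind: str, entity: int, data: dict) -> None:
        self._tables.setdefault(kind, {})[entity] = data

    def all(self, kind: str) -> dict:
        """Return every component of the given kind, keyed by entity id."""
        return self._tables.get(kind, {})


def physics_system(db: Database, dt: float) -> None:
    """Knows only about motion components, not about who created them
    or who will read them afterwards."""
    for motion in db.all("motion").values():
        motion["velocity"] += motion["acceleration"] * dt
        motion["position"] += motion["velocity"] * dt


def render_system(db: Database) -> list:
    """Reads the same central data without ever calling physics_system."""
    return [
        f"entity {entity} at {motion['position']:.1f}"
        for entity, motion in sorted(db.all("motion").items())
    ]
```

The two systems never reference each other; the central database is their only shared dependency, and each can be exercised by filling it with a handful of components.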
Testing is also still quite simple because of the narrow focus of each subsystem. We're no longer mocking concrete objects with dependency injection so much as generating a minimum amount of data relevant to a particular system and testing whether that system provides the correct output for a given input. With so few systems to test (just dozens can make up a complex piece of software), it also substantially reduces the number of tests required.
Breaking Encapsulation
The system then turns into a rather flat pipeline transforming central application state through independent subsystems that are practically oblivious to each other's existence. One might sometimes push a central event to the database which another system processes, but that other system is still oblivious about where that event came from. I've found this is the key to tackling complexity at least in my domain, and it is effectively through an entity-component system.
Yet it resembles something closer to procedural or functional programming at the broad scale, to decouple all these subsystems and let them work with minimal knowledge of the outside world, since we're breaking encapsulation in order to achieve this and avoid requiring the systems to talk to each other. When you zoom in, you might find your share of objects being used to implement any one of these subsystems, but at the broadest scale, the system resembles something other than OOP.
Global Data
I have to admit that I was very hesitant about applying ECS at first to an architectural design in my domain since, first, it hadn't been done before to my knowledge in popular commercial competitors (3DS Max, SoftImage, etc), and second, it looks like a whole bunch of globally-accessible data.
I've found, however, that this is not a big problem. We can still very effectively maintain invariants, perhaps even better than before. The reason is due to the way the ECS organizes everything into systems and components. You can rest assured that an audio system won't try to mutate a motion component, e.g., not even under the hackiest of situations. Even with a poorly-coordinated team, it's very improbable that the ECS will degrade into something where you can no longer reason about which systems access which component, since it's rather obvious on paper and there are virtually no reasons whatsoever for a certain system to access an inappropriate component.
To the contrary, it often removed many of the former temptations to do hacky things with the data wide open, since a lot of the hacky things done in our former codebase under loose coordination and crunch time were hasty attempts to x-ray abstractions and access the internals of the ecosystems of objects. The abstractions started to become leaky as a result of people, in a hurry, trying to just get and do things with the data they wanted to access. They were basically jumping through hoops trying to just access data, which led to interface designs degrading quickly.
There is something vaguely resembling encapsulation still, just due to the way the system is organized, since there's often only one system modifying a particular type of component (two in some exceptional cases). But they don't own that data, and they don't provide functions to retrieve that data. The systems don't talk to each other. They all operate through the central ECS database (which is the only dependency that has to be injected into all these systems).
Flexibility and Extensibility
This is already widely discussed in external resources about entity-component systems, but they are extremely flexible at adapting to radically new design ideas in hindsight, even concept-breaking ones like a suggestion for a creature which is a mammal, insect, and plant that sprouts leaves under sunlight all at once.
One of the reasons is because there are no central abstractions to break. You introduce some new components if you need more data for this or just create an entity which strings together the components required for a plant, mammal, and insect. The systems designed to process insect, mammal, and plant components then automatically pick it up and you might get the behavior you want without changing anything besides adding a line of code to instantiate an entity with a new combo of components. When you need whole new functionality, you just add a new system or modify an existing one.
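The creature example can be sketched roughly as follows. World, the three systems, and the component fields (leaves, body_temp, legs) are all hypothetical; the point is that the new creature is one extra line that combines existing components, and the existing systems pick it up unchanged.

```python
from collections import defaultdict


class World:
    """Component name -> {entity id -> component data}."""

    def __init__(self) -> None:
        self.components = defaultdict(dict)
        self._next_id = 0

    def create_entity(self, **components) -> int:
        """Register one entity as a combination of named components."""
        entity = self._next_id
        self._next_id += 1
        for name, data in components.items():
            self.components[name][entity] = data
        return entity


def plant_system(world: World, sunlight: bool) -> None:
    """Every plant component sprouts a leaf under sunlight."""
    for plant in world.components["plant"].values():
        if sunlight:
            plant["leaves"] += 1


def mammal_system(world: World) -> None:
    """Every mammal component regulates its body temperature."""
    for mammal in world.components["mammal"].values():
        mammal["body_temp"] = 37.0


def insect_system(world: World) -> None:
    """Every insect component gets six legs."""
    for insect in world.components["insect"].values():
        insect["legs"] = 6


world = World()
fern = world.create_entity(plant={"leaves": 3})
# The "concept-breaking" creature: just a new combination of components.
chimera = world.create_entity(
    plant={"leaves": 0}, mammal={"body_temp": 0.0}, insect={"legs": 0}
)
plant_system(world, sunlight=True)
mammal_system(world)
insect_system(world)
```

No class hierarchy had to change to admit the chimera, because there was never a central Mammal, Insect, or Plant abstraction for it to violate.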
What I haven't found discussed so much elsewhere is how much this eases maintenance even in scenarios when there are no concept-breaking design changes that we failed to anticipate. Even ignoring the flexibility of the ECS, it can really simplify things when your codebase reaches a certain scale.
Turning Objects Into Data
In a previous OOP-heavy codebase where I saw the difficulty of maintaining a codebase closer to the first graph above, the amount of code required exploded because the analogical Car in this diagram:
... had to be built as a completely separate subtype (class) implementing multiple interfaces. So we had an explosive number of objects in the system: a separate object for point lights from directional lights, a separate object for a fish eye camera from another, etc. We had thousands of objects implementing a few dozen abstract interfaces in endless combinations.
When I compared it to the ECS, the ECS required only hundreds, and we were able to do exactly the same things as before using a small fraction of the code, because it turned the analogical Car entity into something that no longer requires its own class. It becomes a simple collection of component data, a generalized instance of just one Entity type.
OOP Alternatives
So there are cases like this where OOP applied in excess at the broadest level of the design can start to really degrade maintainability. At the broadest birds-eye view of your system, it can help to flatten it and not try to model it so "deep" with objects interacting with objects interacting with objects, however abstractly.
Comparing the two systems I worked on in the past and now, the new one has more features but takes hundreds of thousands of LOC. The former required over 20 million LOC. Of course it's not the fairest comparison since the former one had a huge legacy, but if you take a slice of the two systems which are functionally quite equal without the legacy baggage (at least about as close to equal as we might get), the ECS takes a small fraction of the code to do the same thing, and partly because it dramatically reduces the number of classes there are in the system by turning them into collections (entities) of raw data (components) with hefty systems to process them instead of a boatload of small/medium objects.
Are there any scenarios where a truly non-OOP paradigm is actually a
better choice for a large-scale solution? Or is that unheard of these
days?
It's far from unheard of. The system I'm describing above, for example, is widely used in games. It's quite rare in my field (most of the architectures in my field are COM-like with pure interfaces, and that's the type of architecture I worked on in the past), but I've found that peering over at what game developers are doing when designing an architecture made a world of difference in being able to create something that still remains very comprehensible as it grows and grows.
That said, some people consider ECS to be a type of object-oriented programming on its own. If so, it doesn't resemble OOP of a kind most of us would think of, since data (components and entities to compose them) and functionality (systems) are separated. It requires abandoning encapsulation at the broad system level which is often considered one of the most fundamental aspects of OOP.
High-Level Coding
But it seems to me that usually the pieces of higher level solutions are almost always put together in an OOP fashion.
If you can piece together an application with very high-level code, then it tends to be rather small or medium in scale as far as the code your team has to maintain and can probably be assembled very effectively using OOP.
In my field in VFX, we often have to do things that are relatively low-level like raytracing, image processing, mesh processing, fluid dynamics, etc, and can't just piece these together from third party products since we're actually competing more in terms of what we can do at the low-level (users get more excited about cutting-edge, competitive production rendering improvements than, say, a nicer GUI). So there can be lots and lots of code ranging from very low-level shuffling of bits and bytes to very high-level code that scripters write through embedded scripting languages.
Interweb of Communication
But there comes a point, at a large enough scale, with any type of application (high-level, low-level, or a combo) that revolves around a very complex central application state, where I've found it no longer useful to try to encapsulate everything into objects. Doing so tends to multiply complexity and make it harder to reason about what goes on, due to the multiplied amount of interaction between everything. It is not easy to reason about thousands of ecosystems talking to each other unless there is a breaking point, at a large enough scale, where we stop modeling each thing as an encapsulated ecosystem that has to talk to the others. Even if each one is individually simple, everything taken in as a whole can more than overwhelm the mind, and if you revolve the design of an entire large-scale system solely around OOP principles, you often have to take a whole lot of that in to make changes, add new features, debug things and so forth. It can help to break free of encapsulation at some scale for at least some domains.
At that point it's not necessarily so useful anymore to, say, have a physics system encapsulate its own data (otherwise many things could want to talk to it to retrieve that data as well as initialize it with the appropriate input data). That's where I found this alternative through ECS so helpful, since it turns the analogous physics system, and all such hefty systems, into a "central database transformer" or a "central database reader which outputs something new", and these systems can now be oblivious to each other. Each system then starts to resemble a process in a flat pipeline more than an object which forms a node in a very complex graph of communication.
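As a rough sketch of the "flat pipeline" shape (not code from any real engine), the systems below only ever touch a shared central state and never call each other. The table layout, `GRAVITY`, `physics_system`, `render_system` and `run_frame` are all assumptions made up for this illustration.

```python
# Central state: plain tables keyed by entity ID (hypothetical layout).
state = {
    "position": {1: 10.0, 2: 3.0},  # height above the ground
    "velocity": {1: 0.0, 2: 0.0},   # vertical speed
    "sprite": {1: "ball", 2: "crate"},
}

GRAVITY = -10.0


def physics_system(state, dt):
    """Central database transformer: updates velocity and position in place."""
    for entity, velocity in state["velocity"].items():
        velocity += GRAVITY * dt
        state["velocity"][entity] = velocity
        new_height = state["position"][entity] + velocity * dt
        state["position"][entity] = max(0.0, new_height)  # stop at the ground


def render_system(state, dt):
    """Central database reader: outputs draw commands and modifies nothing."""
    return [(state["sprite"][e], state["position"][e]) for e in sorted(state["sprite"])]


# The whole "architecture" is an ordered list. Neither system knows the other exists.
PIPELINE = [physics_system, render_system]


def run_frame(state, dt):
    outputs = {}
    for system in PIPELINE:
        outputs[system.__name__] = system(state, dt)
    return outputs


frame = run_frame(state, dt=0.5)
```

Adding a new system means appending one function to `PIPELINE`; no existing system has to expose a new accessor or learn about a new collaborator.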
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 6 years ago.
I'm trying to tell someone his code is not "coherent" in the sense that it serves multiple purposes. I don't think I can explain it very well, so I'm looking for a good reference and/or definition.
I think the correct term is cohesion.
In computer programming, cohesion is a measure of how strongly-related and focused the various responsibilities of a software module are. Cohesion is an ordinal type of measurement and is usually expressed as "high cohesion" or "low cohesion" when being discussed.
Modules with high cohesion tend to be preferable because high cohesion is associated with several desirable traits of software including robustness, reliability, reusability, and understandability whereas low cohesion is associated with undesirable traits such as being difficult to maintain, difficult to test, difficult to reuse, and even difficult to understand.
I had Code Complete by Steve McConnell (i.e. the programmer's bible) next to my computer, with the page open explaining cohesion, so I thought I'd share:
Cohesion arose from structured design and is usually discussed in the same context as coupling. Cohesion refers to how closely all the routines in a class or all the code in a routine support a central purpose: how focused the class is. Classes that contain strongly related functionality are described as having strong cohesion, and the heuristic goal is to make cohesion as strong as possible.
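A small illustration of that definition at the routine level, sketched in Python. The function names are hypothetical and the example is deliberately tiny: the first routine mixes parsing, computation and formatting, while the second version gives each routine a single central purpose.

```python
# Low cohesion: one routine parses input, computes a statistic and formats output.
def report(raw):
    values = [float(field) for field in raw.split(",")]
    mean = sum(values) / len(values)
    return f"mean={mean:.2f}"


# Stronger cohesion: every line in each routine supports one purpose.
def parse_values(raw):
    """Turn a comma-separated string into a list of floats."""
    return [float(field) for field in raw.split(",")]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def format_mean(value):
    """Render a mean for display."""
    return f"mean={value:.2f}"


combined = report("1,2,3")
composed = format_mean(mean(parse_values("1,2,3")))
```

Both versions produce the same result, but only the second lets you change the input format, the statistic, or the output format without touching the other two.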
I use the term “separation of concerns” to explain this while refactoring. Often when code is fairly new, things will get lumped in together as the separate concerns are not clear at first.
One easy way to illustrate this to your coworker would be to ask them to write test cases for the code. Code that serves several purposes tends to need tests with unrelated setup and assertions, which should make it apparent that the code is not clear or coherent.
Another good phrase to use is that functions/objects “should do one thing, and do it well”. This has implications for everything from the object/method names to the overall architecture of the system.
In addition to the answers given so far, a simple way to think of high cohesion is lack of duplication of functionality, and clear separation of related functionality into distinct modules, components or classes. Thus if you want a function similar to another function, and you cut and paste and subsequently modify a copy of the code, you are reducing cohesion. If you modify the original to handle the new case, where the new case is clearly related to the existing functionality, you are increasing cohesion. Put another way, if your program has to do a given thing, no matter how many times or in how many places, for maximum cohesion there should be only one piece of code that does that thing. At the same time, a given class, module or component should have a single area of responsibility. Lumping unrelated functionality into a single class or component also reduces cohesion.
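The cut-and-paste case can be sketched like this in Python. The tax rates and function names are invented for the example; prices are integer cents so the arithmetic stays exact.

```python
# Cut-and-paste approach: two near-identical routines that must be kept in sync.
def total_standard(prices_cents):
    subtotal = sum(prices_cents)
    return subtotal + subtotal * 20 // 100


def total_reduced(prices_cents):
    subtotal = sum(prices_cents)
    return subtotal + subtotal * 5 // 100


# Modifying the original to handle the related case: one piece of code does the thing.
def total_with_tax(prices_cents, rate_percent):
    subtotal = sum(prices_cents)
    return subtotal + subtotal * rate_percent // 100


basket = [1000, 2000]
standard = total_with_tax(basket, 20)
reduced = total_with_tax(basket, 5)
```

If the rounding rule ever changes, the generalized version has exactly one line to update, whereas the copied versions can silently drift apart.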
As CodeWiki says, cohesion is typically discussed with coupling, where the two can act in opposition to one another, particularly where strict interfaces aren't carefully planned. Many of the googled articles on cohesion relate to OO design, but cohesion and coupling are not restricted to OO.
Marked by an orderly, logical, and aesthetically consistent relation of parts; "a coherent argument" - from http://www.websters-online-dictionary.org/definition/coherent