What duplication detection threshold do you use? [closed] - code-analysis

We all agree that duplication is evil and should be avoided (the Don't Repeat Yourself principle).
To detect it, static analysis tools can be used, like Simian (multi-language) or Clone Detective (a Visual Studio add-in).
I just read Ayende's post about Kobe, where he says:
8.5% of Kobe is copy & pasted code. And that is with the sensitivity dialed high; if we set the threshold to 3, which is what I commonly do, it goes up to 12.5%.
I think that a threshold of 3 is very low.
In my company we offer code quality analysis as a service; our default threshold for duplication is set to 20, and we still find a lot of duplication. I can't imagine setting it to 3: it would be impossible for our customers to even think about fixing everything it flagged.
I understand Ayende's opinion about Kobe: it's an official sample and is marketed as “intended to guide you with the planning, architecting, and implementing of Web 2.0 applications and services.” so the expectation of quality is high.
But for your own projects, what minimum duplication threshold do you use?
Related question: How fanatically do you eliminate Code Duplication?

Three is a good rule of thumb, but it depends. Refactoring to eliminate duplication often involves trading conceptual simplicity of the codebase and API for a smaller codebase that is more maintainable once someone does understand it. I generally evaluate things in this light.
At one extreme, if fixing the duplication makes the code more readable and adds little or nothing to the conceptual complexity of the code, then any duplication is unacceptable. An example of this would be whenever the duplicated code factors out neatly into a simple referentially transparent function that does something that's easy to explain and name.
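For example, a minimal Java sketch of that easy case (the VAT rate and all names are invented for illustration):

class Totals {
    // Referentially transparent: the same input always produces the same
    // output, so the function is easy to explain, name, and test.
    static double withVat(double amount) {
        return amount * 1.21; // 21% VAT, an invented rate for this example
    }

    public static void main(String[] args) {
        double shipping = 4.95;
        // Before the refactoring, "x + x * 0.21" was duplicated at both sites.
        System.out.println(withVat(100.0) + shipping); // invoice total
        System.out.println(withVat(250.0) + shipping); // quote total
    }
}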
When a more complex, heavyweight solution, such as metaprogramming, OO design patterns, etc. is necessary, I may allow 4 or 5 instances, especially if the duplicated snippet is small. In these cases I feel that the conceptual complexity of the solution makes the cure worse than the ill until there are really a lot of instances.
In the most extreme case, where the codebase I'm working with is a very rapidly evolving prototype and I don't know enough about what direction the project may evolve in to draw abstraction lines that are both reasonably simple and reasonably future-proof, I just give up. In a codebase like this, I think it's better to just focus on expediency and getting things done than good design, even if the same piece of code is duplicated 20 times. Often the parts of the prototype that are creating all that duplication are the ones that will be discarded relatively soon anyhow, and once you know what parts of the prototype will be kept, you can always refactor these. Without the additional constraints created by the parts that will be discarded, refactoring is often easier at this stage.

I don't know what a good 'metric' for it is, but what I usually strive for is:
if you have the same code in two places, and
the code is the same by intent (rather than merely coincidentally the same)
then refactor to get rid of the duplication. All duplication is bad. I'll rarely let code be in two places, and at three, it definitely has to go.

I'm personally pretty fanatical about it. I try to design my projects to avoid code duplication, and my goal is to keep the threshold in the low single digits. If I can't get there, it means my design isn't good enough and I need to go back to the drawing board or refactor.

Depends on the programming language. (The "Clone Detective" guy seems to recognize this: "programming language constraints" is one of the boxes in his first presentation.)
In a Lisp program, any duplicate expression is easily subject to refactoring -- I guess you'd call that threshold 2. Everything is comprised of expressions, and there are macros to translate expressions, so there's rarely an excuse for duplicating anything even once. (About the only thing I can think of which would be hard to extract would be something like LOOP clauses, and in fact many people advocate avoiding LOOP for just that reason.)
In many other languages, programs consist of classes which have methods which have statements, so it's harder to just pull out an expression and use it in two different files. Often it means changing the structure of the thing as you extract it. Often there's also a requirement of type safety, which can be limiting (unless you want to write a whole lot of reflection code all the time to escape it, but if you do, you shouldn't be using a static language). If I made my current statically-typed program perfectly DRY, it would be neither shorter nor easier to maintain.
I guess the upshot of this is that what we really want is "easy to maintain". Sometimes, in some languages, that means "just put a comment here that says what you copied and why". DRY is a good indicator of maintainable code. If you're repeating a lot, that's not a good sign. But chasing any single statistic also tends to be bad -- otherwise, we'd solve all our problems by simply optimizing for that.

I think we need to combine:
the number of lines that are duplicated
the number of times the duplicate has been copied
how "close" the duplicates are to each other
(e.g. if they are in different products that just happen to be in the same source code control system, that is very different from their being in the same method)
the time since any of the methods containing the duplicated code last changed
so as to get a good cost/benefit trade-off for removing the duplicates. A rough sketch of such a combined score follows.
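Here is what such a score could look like in Java; the weights and field names are invented for illustration and not taken from any real tool:

class DuplicateScore {
    int lines;         // number of duplicated lines
    int copies;        // number of times the snippet has been copied
    int distance;      // 1 = same method ... 4 = different product
    int daysSinceEdit; // days since any containing method last changed

    double priority() {
        // Large, heavily copied, nearby, recently touched duplicates are
        // the most worth removing; staleness discounts roughly per month.
        return (lines * copies) / (double) (distance * (1 + daysSinceEdit / 30));
    }

    public static void main(String[] args) {
        DuplicateScore d = new DuplicateScore();
        d.lines = 12; d.copies = 4; d.distance = 1; d.daysSinceEdit = 10;
        System.out.printf("removal priority: %.2f%n", d.priority());
    }
}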

Our CloneDR finds duplicate code, both exact copies and near-misses, across large source systems, parameterized by language syntax. It supports Java, C#, COBOL, C++, PHP and many other languages.
It accepts a number of parameters to define "what is a clone?", including:
a) similarity threshold, controlling how similar two blocks of code must be to be declared clones (typically 95% is good)
b) minimum clone size in lines (3 tends to be a good choice)
c) number of parameters, i.e. distinct changes to the text (5 tends to be a good choice)
With these settings, it tends to find 10-15% redundant code in virtually everything it processes.
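To make those parameters concrete, here is a deliberately naive, line-based illustration of what such a check means; this is an invented sketch, not CloneDR's actual (syntax-aware) algorithm:

import java.util.List;

class CloneCheck {
    // Two same-length blocks count as a clone pair if they are at least
    // minLines long, differ on at most maxParams lines, and are at least
    // 'similarity' identical line-for-line.
    static boolean isClonePair(List<String> a, List<String> b,
                               int minLines, int maxParams, double similarity) {
        if (a.size() != b.size() || a.size() < minLines) return false;
        int same = 0;
        for (int i = 0; i < a.size(); i++) {
            if (a.get(i).equals(b.get(i))) same++;
        }
        int diffs = a.size() - same;
        return diffs <= maxParams && (double) same / a.size() >= similarity;
    }

    public static void main(String[] args) {
        List<String> x = List.of("init();", "process(a);", "cleanup();");
        List<String> y = List.of("init();", "process(b);", "cleanup();");
        // 2 of 3 lines identical, 1 "parameter": a near-miss clone.
        System.out.println(isClonePair(x, y, 3, 5, 0.6)); // prints true
    }
}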

Related

Is it really so terrible to use global variables? [closed]

I've been developing in Objective-C for a long time, mostly games, and lots of people have commented that I frequently use global variables. For example, in my .m file before the #implementation I mostly have:
BOOL X=0;
int y=1;
NSString *ran;
.
.
.
Now, I know I'm supposed to use properties, but I've found it much clearer to use these global variables, and I keep them safe.
Apart from the fact that it is not object-oriented and/or not considered acceptable, does it affect any other facet of my app, like processor usage?
In my games I have something like 40 booleans in one class that are shared by almost ALL methods. I find it almost impossible to write getters/setters, or use properties, for all of them, and I am comfortable with my way. But is it so wrong?
Is there another way to deal with many booleans that change in real time frequently, and are shared by all methods?
Is it so terrible to use globals? (I don't need to be considered a good Objective-C user…)
Global variables are generally considered bad practice because they lead to coding problems in the long run. If you find yourself able to maintain your code, then go ahead.
But eventually, you'll work on a project big enough where it will cause problems. Why not learn to get along without them now on the easier projects?
ObjC's not much different from its relatives in this regard.
The problem is that such a program is very difficult to reuse in other contexts. That is, it can be a better choice to reimplement a program entirely rather than make the one with 40 globals reusable (and retest everything).
40 Booleans for one class is also a llllloooooooot. Read your code -- look for patterns. Make smaller, more easily reusable implementations if you want to get away from the globals. Many devs consider them huge maintenance pains (war stories!). I could easily see myself having trouble trying to understand the program flow of such a program.
Even packing your 40 bools into a C struct and putting an instance of that struct in your ObjC class would be one huge improvement, and it is simple to implement.
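The same move in Java terms, as a hedged sketch (all names invented): group the flags into one state object that the class owns.

class GameFlags {
    boolean paused;
    boolean soundOn;
    boolean levelComplete;
    // ...the remaining flags live here instead of at file scope...
}

class Game {
    private final GameFlags flags = new GameFlags(); // one owner, not 40 globals

    void update() {
        // All reads and writes of the flags now go through one known place.
        if (!flags.paused && flags.levelComplete) {
            flags.soundOn = true; // stand-in for real game logic
        }
    }
}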
If you have had no problems maintaining these programs, consider it a blessing! …but it will not be a favorite design for the other people who will read, extend, or maintain said program.
As with most development practices, global variables have their place, BUT they reduce how refactorable, readable and debuggable the code is. Imagine the following scenarios:
a program that has an error in one function, but the error is caused by a global variable. Which other location produced the bad value? It's nearly impossible to tell.
someone who didn't write the code wants to change something, but has no idea where the global values are coming from. The way to figure out how the program works is to understand EVERY place the global variable is used. This is much more difficult than if you had simply encapsulated your functionality appropriately.
one piece of code is repeated over and over again (every method has access to your global so they all use it, but in just barely different ways). Requirements change and you need to change how that works slightly. You now have to change 143 different places in the code. (One time I had to do this was when the software changed from the English system to metric: 30 different code locations, all using DIFFERENT conversion values to do the same thing. A sketch of the encapsulated alternative follows this list.)
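A minimal sketch of that encapsulated alternative (invented names; any single shared conversion routine would do):

class Units {
    private static final double CM_PER_INCH = 2.54;

    // One conversion routine instead of 30 call sites each hard-coding
    // its own (and, as it turned out, different) conversion factor.
    static double inchesToCm(double inches) {
        return inches * CM_PER_INCH;
    }

    public static void main(String[] args) {
        // Every caller shares the same factor; switching measurement
        // systems later means changing one place, not 143.
        System.out.println(Units.inchesToCm(12.0)); // 30.48
    }
}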
On the other hand, if you have performance issues, there are times when having a global will speed things up, but it's much better to code for readability, refactorability and debuggability first, and then refactor for performance if necessary.

OO - resource spending and design confusion? [closed]

Sometimes I use OO design and sometimes a procedural style, and every time I use OOP I feel like I'm wasting resources on nothing. Say I have a situation where I need to grab some values from a data source, a pool of bannerinfo. For the further work I could declare a banner class and decorators for the additional functionality, but why would I go through such a laborious sequence (grab, instantiate objects, fill them, wrap, and so on) rather than just grab the data and run procedural code on it? Yes, OOP often helps organize logic and keep decisions flexible, but on the other hand it costs design time (I run into a lot of problems forcing simple things into an OOP style) and, obviously, machine resources. I'm kind of stuck in that mindset. I'm young, but I've already seen some OOP projects, and I wouldn't say they're easy to understand; the idea of OOP is charming (organizing, making things logical), but...
So, would you mind pointing out some differences between the situations in which I should use OOP and those where a procedural style fits better? I'd appreciate any links to additional literature on the topic. Thanks!
That data you're grabbing has a structure to it, i.e. the order in which the fields show up within each record in the data source. The code you want to run on that data is closely bound to that structure (i.e. the code is not going to apply to other data structures, and if the data structure changes you certainly want to change the code). So it makes sense to keep the data and behaviour together from a "mental information management" point of view, and objects are a great way to do this.
What if your program grows, and you want to iterate through bannerinfo in multiple places within the project? Of course you could create a routine available from the whole program which does what you want on the bannerinfo, and call that from each point where you need it. But what if you then think of other things you want to do with a bannerinfo? Of course you could just create another routine available from the whole program, but it would be completely separate from the first. What if these two routines had some code in common that you could push out to a separate routine, would you create yet another routine available from the whole program, even though it's only used by the other two?
With OOP you'd have a class with two public methods, and one private one for that third routine. Why is this different from having three routines available to the whole program? The answer is clutter. You can add as many methods to that class as you like without adding clutter to the parts of the program that don't use it, since the methods won't be visible there. And if the data structure of bannerinfo changes, you only need to go to one place to make the changes.
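A hedged Java sketch of that shape, with invented names and behaviour:

class BannerInfo {
    private final String title;
    private final int width;
    private final int height;

    BannerInfo(String title, int width, int height) {
        this.title = title;
        this.width = width;
        this.height = height;
    }

    // The first thing you wanted to do with a bannerinfo...
    public String asHtml() {
        return "<img alt=\"" + title + "\"" + dimensions() + ">";
    }

    // ...and the second, added later without touching callers of the first.
    public String asLogLine() {
        return "banner: " + title + dimensions();
    }

    // The shared third routine stays private: it is invisible, and therefore
    // not clutter, everywhere this class is not used.
    private String dimensions() {
        return " width=" + width + " height=" + height;
    }
}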
Of course there's more, but I hope this helps demonstrate where OOP can be useful. It's all about making things easy to manage. If your specific problem doesn't care for that because it is a one-off, or will never grow, then there's not necessarily any benefit.
Final note: whether the benefit is worth the effort also depends on other factors such as how comfortable you are with using objects, what you're trying to do with them (inheritance can get murky), and also on the language and syntax itself.
"grab data, run procedural code on data"
I don't see how dealing with data can be easier with procedural. With OOP you can do stuff like
// fetch every user with a score above 100 (hypothetical query-builder API)
$users = $db->from('users')->where('score', 100, '>')->getMany();
Or with an ORM:
$user = $orm->entity('User')->findOne($id);
$user->setPassword('abc123'); // set a new password
$orm->save($user);
About showing the data (also called 'the view' in MVC architecture), I have to agree that decorators can be annoying. But if you use a templating engine, things are as easy as they can get. You didn't mention which language you are using, but if you are into PHP you can use Twig.
Personally, I feel more comfortable with OOP even in small projects, where you don't even do things like unit-testing. But I think the best of OOP comes when you need maintainability, collaboration, reusability, etc.

SOLID vs. YAGNI [closed]

One of the most frequent arguments I hear for not adhering to the SOLID principles in object-oriented design is YAGNI (although the arguer often doesn't call it that):
"It is OK that I put both feature X and feature Y into the same class. It is so simple why bother adding a new class (i.e. complexity)."
"Yes, I can put all my business logic directly into the GUI code it is much easier and quicker. This will always be the only GUI and it is highly unlikely that significant new requirements will ever come in."
"If in the unlikely case of new requirements my code gets too cluttered I still can refactor for the new requirement. So your 'What if you later need to…' argument doesn't count."
What would be your most convincing arguments against such practice? How can I really show that this is an expensive practice, especially to somebody that doesn't have too much experience in software development.
Design is the management and balance of trade-offs. YAGNI and SOLID aren't conflicting: the former says when to add features, the latter says how, but they both guide the design process. My responses, below, to each of your specific quotes use principles from both YAGNI and SOLID.
It is three times as difficult to build reusable components as single use components. A reusable component should be tried out in three different applications before it will be sufficiently general to accept into a reuse library.
  — Robert Glass' Rules of Three, Facts and Fallacies of Software Engineering
Refactoring into reusable components has the key element of first finding the same purpose in multiple places, and then moving it. In this context, YAGNI applies by inlining that purpose where needed, without worrying about possible duplication, instead of adding generic or reusable features (classes and functions).
The best way, in the initial design, to show when YAGNI doesn't apply is to identify concrete requirements. In other words, do some refactoring before writing code to show that duplication is not merely possible, but already exists: this justifies the extra effort.
Yes, I can put all my business logic directly into the GUI code; it is much easier and quicker. This will always be the only GUI, and it is highly unlikely that significant new requirements will ever come in.
Is it really the only user interface? Is there a background batch mode planned? Will there ever be a web interface?
What is your testing plan, and will you be testing back-end functionality without a GUI? What will make the GUI easy for you to test, given that you usually don't want to test outside code (such as platform-generic GUI controls) but to concentrate on your own project?
It is OK that I put both feature X and feature Y into the same class. It is so simple; why bother adding a new class (i.e. complexity)?
Can you point out a common mistake that needs to be avoided? Some things are simple enough, such as squaring a number (x * x vs squared(x)) for an overly-simple example, but if you can point out a concrete mistake someone made—especially in your project or by those on your team—you can show how a common class or function will avoid that in the future.
If, in the unlikely case of new requirements, my code gets too cluttered I still can refactor for the new requirement. So your "What if you later need to..." argument doesn't count.
The problem here is the assumption of "unlikely". Do you agree it's unlikely? If so, you're in agreement with this person. If not, your idea of the design doesn't agree with this person's—resolving that discrepancy will solve the problem, or at least show you where to go next. :)
I like to think about YAGNI in terms of "half, not half-assed", to borrow the phrase from 37signals (https://gettingreal.37signals.com/ch05_Half_Not_Half_Assed.php). It's about limiting your scope so you can focus on doing the most important things well. It's not an excuse to get sloppy.
Business logic in the GUI feels half-assed to me. Unless your system is trivial, I'd be surprised if your business logic and GUI haven't already changed independently, several times over. So you should follow the SRP ("S" in SOLID) and refactor - YAGNI doesn't apply, because you already need it.
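A minimal Java sketch of that split, assuming an invented bulk-discount rule:

// The business rule lives in its own class, testable without any GUI.
class DiscountPolicy {
    double discountedPrice(double unitPrice, int quantity) {
        return quantity >= 10 ? unitPrice * 0.9 : unitPrice; // invented rule
    }
}

// The GUI layer only formats and forwards; it holds no business logic.
class CheckoutForm {
    private final DiscountPolicy policy = new DiscountPolicy();

    String totalLabel(double unitPrice, int quantity) {
        double total = policy.discountedPrice(unitPrice, quantity) * quantity;
        return String.format("Total: %.2f", total);
    }
}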
The argument about YAGNI and unnecessary complexity absolutely applies if you're doing extra work today to accommodate hypothetical future requirements. When those "what if later we need to..." scenarios fail to materialize, you're stuck with higher maintenance costs from the abstractions that now get in the way of the changes you actually have. In this case, we're talking about simplifying the design by limiting scope -- doing half, rather than being half-assed.
It sounds like you're arguing with a brick wall. I'm a big fan of YAGNI, but at the same time, I also expect that my code will always be used in at least two places: the application, and the tests. That's why things like business logic in UI code don't work; you can't test business logic separate of UI code in that circumstance.
However, from the responses you're describing, it sounds like the person is simply uninterested in doing better work. At that point, no principle is going to help them; they only want to do the minimum possible. I'd go so far as to say that it's not YAGNI driving their actions, but rather laziness, and you alone aren't going to beat laziness (almost nothing can, except a threatening manager or the loss of a job).
There is no answer, or rather, there is an answer neither you nor your interlocutor might like: both YAGNI and SOLID can be wrong approaches.
Attempting to go for SOLID with an inexperienced team, or a team with tight delivery objectives, pretty much guarantees you will end up with an expensive, over-engineered bunch of code... that will NOT be SOLID, just over-engineered (aka welcome to the real world).
Attempting to go YAGNI on a long-term project and hoping you can refactor later only works to an extent (aka welcome to the real world). YAGNI excels at proofs of concept and demonstrators: getting the market or the contract, and then being able to invest in something more SOLID.
You need both, at different points in time.
The correct application of these principles is often not very obvious and depends very much on experience, which is hard to obtain if you didn't do it yourself. Every programmer should have experienced the consequences of doing it wrong, though of course it always should be on "not my" project.
Explain to them what the problem is; if they don't listen and you're not in a position to make them listen, let them make the mistakes. If you're too often the one having to fix the problem, you should polish your resume.
In my experience, it's always a judgment call. Yes, you should not worry about every little detail of your implementation, and sometimes sticking a method into an existing class is an acceptable, though ugly solution.
It's true that you can refactor later. The important point is to actually do the refactoring. So I believe the real problem is not the occasional design compromise, but putting off refactoring once it becomes clear there's a problem. Actually going through with it is the hard part (just like with many things in life... ).
As to your individual points:
It is OK that I put both feature X and feature Y into the same class. It is so simple; why bother adding a new class (i.e. complexity)?
I would point out that having everything in one class is more complex (because the relationship between the methods is more intimate, and harder to understand). Having many small classes is not complex. If you feel the list is getting too long, just organize them into packages, and you'll be fine :-). Personally, I have found that just splitting a class into two or three classes can help a lot with readability, without any further change.
Don't be afraid of small classes, they don't bite ;-).
Yes, I can put all my business logic directly into the GUI code; it is much easier and quicker. This will always be the only GUI and it is highly unlikely that significant new requirements will ever come in.
If someone can say "it is highly unlikely that significant new requirements will ever come in" with a straight face, I believe that person really, really needs a reality check. Be blunt, but gentle...
If in the unlikely case of new requirements my code gets too cluttered I still can refactor for the new requirement. So your 'What if you later need to...' argument doesn't count.
That has some merit, but only if they actually do refactor later. So accept it, and hold them to their promise :-).
SOLID principles allow software to adapt to change, both in requirements and in technical changes (new components, etc.). Two of your arguments assume unchanging requirements:
"it is highly unlikely that significant new requirements will ever come in."
"If in the unlikely case of new requirements"
Could this really be true?
There is no substitute for experience when it comes to the various expenses of development. For many practitioners I think doing things in the lousy, difficult to maintain way has never resulted in problems for them (hey! job security). Over the long term of a product I think these expenses become clear, but doing something about them ahead of time is someone else's job.
There are some other great answers here.
Understandable, flexible and capable of fixes and improvements are always things that you are going to need. Indeed, YAGNI assumes that you can come back and add new features when they prove necessary with relative ease, because nobody is going to do something crazy like bunging irrelevant functionality in a class (YAGNI in that class!) or pushing business logic to UI logic.
There can be times when what seems crazy now was reasonable in the past: sometimes the boundary lines between UI and business logic, or between different sets of responsibilities that should be in different classes, aren't that clear, or even move. There can be times when 3 hours of work is absolutely necessary within 2 hours' time. There are times when people just don't make the right call. For those reasons occasional breaks in this regard will happen, but they are going to get in the way of using the YAGNI principle, not be a cause of it.
Quality unit tests, and I mean unit tests not integration tests, need code that adheres to SOLID. Not necessarily 100%, in fact rarely so, but in your example stuffing two features into one class will make unit testing harder, breaks the single responsibility principle, and makes code maintenance by team newbies much harder (as it is much harder to comprehend).
With the unit tests (assuming good code coverage) you'll be able to refactor feature 1 safe and secure that you won't break feature 2; but without unit tests, and with both features in the same class (simply out of laziness, in your example), refactoring is risky at best, disastrous at worst.
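A minimal sketch of that payoff (invented behaviour; a plain main stands in for a real test framework):

// Once feature 1 lives in its own class, it can be exercised in isolation.
class FeatureOne {
    int apply(int x) {
        return x * 2; // stand-in for feature 1's real behaviour
    }
}

class FeatureOneTest {
    public static void main(String[] args) {
        // Refactoring feature 1 must not change its observable behaviour;
        // feature 2, now in its own class, is never touched by this test.
        if (new FeatureOne().apply(21) != 42) {
            throw new AssertionError("feature 1 regression");
        }
        System.out.println("feature 1 ok; feature 2 untouched");
    }
}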
Bottom line: follow the KIS principle (keep it simple), or for the intellectual, the KISS principle (keep it simple, stupid). Take each case on its merits; there's no global answer. Always consider whether other coders will need to read or maintain the code in the future, and the benefit of unit tests in each scenario.
tldr;
SOLID assumes you understand (somewhat, at least) the future changes to the code, with respect to SRP. I would say that is being optimistic about our capability to predict.
YAGNI, on the other hand, assumes that most of the time you don't know the future direction of change, which is pessimistic about our capability to predict.
Hence it follows that SOLID/SRP asks you to form classes such that each will have a single reason to change, e.g. a small GUI change or a ServiceCall change.
YAGNI says (if you want to force-apply it in this scenario) that since you don't know WHAT is going to change, and a GUI change may cause a GUI+ServiceCall change (and similarly a backend change may cause a GUI+ServiceCall change), you should just put all that code in a single class.
Long answer:
Read the book 'Agile Software Development, Principles, Patterns, and Practices'
Here is a short excerpt from it about SOLID/SRP:
"If,[...]the application is not changing in ways that cause the two responsibilities to change at different times, there is no need to separate them. Indeed, separating them would smell of needless complexity.
There is a corrolary here. An axis of change is an axis of change only if the changes occur. It is not wise to apply SRP—or any other principle, for that matter—if there is no symptom."

Ravioli code - why an anti-pattern? [closed]

I recently came across a term 'God object' which was described as an 'anti-pattern'. I'd heard of bad coding practices, but I'd never heard them described thus.
So I headed off to Wikipedia to find out more, and I found that there is an anti-pattern called 'Ravioli code' which is described as being "characterized by a number of small and (ideally) loosely-coupled software components."
I'm puzzled - why is this a bad thing?
Spaghetti:
Spaghetti code is a pejorative term for source code.
Ravioli:
Ravioli code is a type of computer program structure, characterized by a number of small and (ideally) loosely-coupled software components. The term is in comparison with spaghetti code, comparing program structure to pasta.
It's comparing them. It isn't saying it's an anti-pattern.
But I agree, the article is confusing. The problem is not with the ravioli analogy but with the structure of the Wikipedia article itself. It starts by calling spaghetti a bad practice, then gives some examples, and after that says something about ravioli code.
EDIT: They improved the article. It is an anti-pattern because:
While generally desirable from a coupling and cohesion perspective, overzealous separation and encapsulation of code can bloat call stacks and make navigation through the code for maintenance purposes more difficult.
I'd say it's pretty obvious Ravioli code and Lasagna code are both pejorative terms - being intentionally placed alongside their fellow pasta simile 'spaghetti code' - and both terms describe real world anti-patterns. Some code is very time-consuming to maintain and very prone to failure simply because it is broken down into so many separate sub-processes - that is ravioli code. Some code has so many layers of abstraction that it becomes very difficult to implement a change to it and/or understand at what level a failure is occurring - that is lasagna code. The only practical way to make a change to lasagna code is often to simply bypass the layers and write the straightforward code that does the job. I have to maintain some ravioli code, but in general such code suffers from its convolution and fails to find widespread use, so there are few examples of it that we would all be familiar with. By contrast, lasagna code is everywhere at the moment.
I like to think I don't write any pasta code myself, but you can at least follow a string of spaghetti...
It's listed on the spaghetti code page, but that doesn't mean it's a bad thing. It's there because it is a relevant term that isn't important enough to have its own page.
Regarding the bad side of it, Googling gives a comment in http://developers.slashdot.org/comments.pl?sid=236721&cid=19330355:
The problem is that it tends to lead to functions (methods, etc.) without true coherence, and it often leaves the code to implement even something fairly simple scattered over a very large number of functions. Anyone having to maintain the code has to understand how all the calls between all the bits work, recreating almost all the badness of Spaghetti Code except with function calls instead of GOTO. ...
You gotta judge if it's reasonable :).
Pretty much the only reason "ravioli code" has survived as a phrase is because programmers have an innate sense of humor. Try as I might - and believe me, I've tried - it's really hard to come up with an example of object oriented code that was both (a) packaged such that it was really hard to navigate in the same meta-sense that "spaghetti code" is hard to navigate, and (b) reflected a frequent anti-pattern in coding practice.
The best example of "ravioli code" I can come up with is a multitude of classes, each tightly packaged, but where it's really hard to dig out where the main flow of execution is. Neural network applications might exhibit this, but that's sort of the point. A more mundane example would be UI code that is very heavily event-oriented, but again, it's hard to go overboard with that - if anything, most UI code isn't event-driven enough.
The problem is that different people use the term "ravioli code" to mean different things.
A reasonable number of reasonably small components is good. A huge pile of tiny components with no apparent overall structure is not so good.
Loosely coupled components are good. Components that hide their interdependencies in order to look loosely coupled are not so good.
Being able to see the relationship between different components is actually a good thing.
Most code bases have the opposite problem though, so moving towards more modularity is usually a good thing. I expect most folks have never even seen "ravioli code" in the bad sense. In practice, it tends to look more like chopped ravioli (where what should be a module is split across multiple "modules" -- none of which make sense on their own, but only in combination with their corresponding other parts), or like ravioli cooked without enough water (so you end up with a giant blob of "modules" all stuck together).
TODO: Write a "Hello world" program as ~100 modules to demonstrate overmodularity.
TODO2: Attempt to build a bridge out of truckloads of ravioli to demonstrate suboptimal structural characteristics.
If you apply a dogmatic rule that all classes in all projects must be loosely coupled regardless of any reason, then I can see there being a lot of potential problems.
You could spin your wheels trying to make a perfectly fine application more and more and more loosely coupled without ever actually adding any value to it.
Let me hasten to add, though, that I think we should all aim towards loosely coupled classes, components, etc.
It's not necessarily, is it? The Wikipedia article doesn't describe it as bad. A number of loosely-coupled software components, where the size and number of these components is sensible in relation to the problem domain, sounds pretty ideal to me.
In fact, if you look up other definitions of ravioli code, you'll find it described as the ideal software design pattern - I still prefer the caveat that the size and number need to be appropriate.
Reading the article, Spaghetti is an anti-pattern; but Ravioli and Lasagna are not anti-patterns.

Does anyone else have the feeling that solutions for simple projects are often overengineered? [closed]

Somehow I've got the feeling that many projects become heavily overengineered so that every possible change request can be tackled, with the effect that the change requests that actually occur are very hard to implement.
Somehow I get that feeling in nearly every project I'm currently working on. It is like everyone is thinking "which cool API, framework, etc. can we add to the project to tackle this and that aspect" without evaluating whether it is practical or needed.
Does anyone else feel the same or what's the opinion of the community here?
Yes.
-- MarkusQ
I find that, while some older companies with 'senior' senior management tend to be extremely rigid with how their business code is created, newer companies completely lack a backbone of what software they use to get the job done.
The problem you describe sounds like people are viewing problems at too high a level and want to find a way to solve it in one go. Creating a working toolbox (think standard libraries) would help them out in the long run.
I particularly enjoy the UNIX way of things: Several tiny utilities that do one thing and do it really well.
Definitely - I think people get 'robust and modular' and 'overengineered' mixed up far too often.
I think Dave Winer captured the cyclical nature of this phenomenon well:
The trick in each cycle is to fight complexity, so the growth can keep going. But you can't keep it out, engineers like complexity, not just because it provides them job security, also because they really just like it. But once the stack gets too arcane, the next generation throws their hands up and says "We're not going to deal with that mess."
Guilty as charged. Three levels of abstraction and six config files is just more fun to implement than a simple flow control branch.
Overengineering is a danger in any project, large or small. When writing for code extensibility, there's always a trade-off between writing code that is sufficiently generic and extensible to allow for future development and code that is specific enough to make the task at hand easy.
Every "future hook" has a cost and should be evaluated in the light of that cost, the probability that it will ever be used, and the cost of refactoring later in the game. "More generic" is not always better.
As for using a hot, new framework or API, I think project managers should take a gentle hand in this. With development being such a fast-moving field, hands-on self-training is part of the price of doing business (but obviously should be kept reasonable.)
I've only encountered that with Java people: "Let's use Spring! And Hibernate! I found this cool validation package as well! And this one will create XSLT to create XML forms to create the JavaScript!"
By the time I find all the jars and get them to play nice, there are dozens of classes I have to be familiar with. And then they want to abstract away all those pieces! "We might switch Spring out. Or not use Hibernate, so we should abstract those away." Add another dozen hacky wrapper classes.
By the time it's all done, 50 lines of pseudocode have turned into several thousand lines of making the "cool stuff" work, with about 100 lines of business logic trapped within its hairy, hacky, bug-ridden hell.
I'm guilty of it myself, too, but that's only because I'm bored and have time to kill.
The key thing to keep in mind here is time. A "simple" project that is knocked out of the park on day one by an Excel spreadsheet or a web page can quickly expand in both its audience and scope to become an unsustainable monster.
Under-engineering is a danger, too. The world is full of "practical" people who will mock any efforts that are contrary to their opinion. Nobody remembers the person who disagreed a year later when the "simple" solution can't be sustained.
The trick is to balance the right degree of engineering and be able to adjust when things change.
I've been bitten a few too many times by applications that I've thrown together quickly which have subsequently become "critical" apps. Then I have to almost rewrite the darned thing. I just assume now that it will become critical and so I "write it right" the first time. Saves me time and effort in the long term. So oddly, it's about laziness.
Today's xkcd is the absolute truth.
I find that overengineering is a byproduct of boredom. I believe Jeff and Joel covered this in a podcast, but the idea is that coders who often overengineer may just be in a rut and need a change of pace (Jeff and Joel suggested that they be allowed to do different jobs like QA).
Yes. I think people nowadays think more about how something should be done, and whether "it's the right way ..." and so on, than just picking the simple solution which will equally get the job done. If you're not asked to expand it, then don't think about expanding it.
I think this is a problem, but not as big as it seems and that there are good reasons for it.
The problem is that most projects that are underengineered die a quick death, while the overengineered ones can survive. Thus, when you look at still-living projects, there is survivorship bias.
And, even if you are a good architect who aims for "just good enough", when in doubt you will use the more flexible, scalable, ... (i.e., overengineered) solution. Because the failure mode if you do not meet the requirements (whatever they may be) is usually much worse than what happens if you exceed them.
I have yet to work on any real-world problem, but I've read enough people blogging about this to assume there is a problem.
Absolutely. But it's not just developers, it's users too. "I want this and that and the other in order to improve communication and increase productivity!".
I'm amazed at how many of these types of projects we've solved by just putting a shared folder on a file server.
As others have said, I've had too many 'small' projects become big ones, and the shortcuts taken become a pain.
There is a place for a quick-and-dirty solution and in trying to follow the YAGNI mantra, I've created simple apps with no engineering.
That said, you are not going to be able to jump into a million-line system and develop a 'well engineered' system without practice on smaller ones.
I've taken to always following the developed architecture of our company in all projects because it helps me to work out solutions in line with the engineering principles we are trying to follow.
What you practice, you perform, so always do your best.