Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
I am converting an open source Java library to C#, which has a number of methods and classes tagged as deprecated. This project is an opportunity to start with a clean slate, so I plan to remove them entirely. However, being new to working on larger projects, I am nervous that the situation will arise again. Since much of agile development revolves around making something work now and refactoring later if needed, it seems like deprecation of APIs must be a common problem. Are there preventative measures I can take to avoid/minimize API deprecation, even if I am not entirely sure of the future direction of a project?
I'm not sure there is much you can do. Requirements change, and if you absolutely have to make sure that clients of the API are not broken by newer API version, you'll have rely on simply deprecating code until you think that no-one is using the deprecated code.
Placing [Obsolete] attributes on code causes the compiler to create warnings if there are any references to the obsolete methods. This way clients of the API, if they are diligent about fixing their compiler warnings, can gradually move to the new methods without having everything break with the new version.
Its useful if you use the ObsoleteAttribute's override which takes a string:
[Obsolete("Foo is deprecated. Use Bar instead for munging widgets.")]
<frivolous>
Perhaps you could create a TimeBombAttribute:
[TimeBomb(new DateTime(2010,1,1), "Foo will blow up! Better use Bar, or else."]
In your code, reflect for methods with the timebomb attribute and throw KaboomException if they are called after the specified date. That'll make sure that after 1st January 2010 no-one is using the obsolete methods, and you can clean up your API nicely. :)
</frivolous>
As Matt says, the Obsolete attribute is your friend... but whenever you apply it, provide details of how to change calling code. That way you've got a lot better chance of people actually changing. You might also want to consider specifying which version you anticipate removing the method in (probably the next major release).
Of course, you should be diligent in making sure you don't call the obsolete code - particularly in sample code.
Since much of agile development revolves around making something work now and refactoring later if needed
That's not agile. It's cowboy coding disguised under the label of agile.
The ideal is that whatever you complete, is complete, according to whatever Definition of Done you have. Usually the DoD states something along the lines of "feature impelmented, tested and related code refactored". Of course, if you are working on a throwaway prototype, you can have a more relaxed DoD.
API modifications are a difficult beast. If they are only project-internal APIs you are modifying, the best way to go is to refactor early. If you need to change the internal API, just go ahead and change all API clients at the same time. This way the refactoring debt does not grow very large and you don't have to use deprecation.
For published APIs you probably have some source and binary compatibility guarantees you have to maintain, at least until the next major release or so. Marking the old APIs deprecated works while maintaining compatibility. As with internal APIs, you should fix your internal code as soon as possible to not use the deprecated APIs.
Matt's answer is solid advice. I just wanted to mention that intially you probably want to use something along the lines of:
[Obsolete("Please use ... instead ", false)]
Once you have the code ported, change the false to true and the compiler will then treat all the calls to the method as an error.
Watch Josh Bloch's "How to Design a Good API and Why It Matters"
Most important w/r/t deprecation is knowing that "when in doubt, leave it out." Watch the video for clarification, but it has to do with having to support what you provide forever. If you are realistically expecting that API to be reused, you're effectively setting your decisions in stone.
I think API design is a much trickier thing to do in an Agile fashion because you're expecting it to be reused probably in many different ways. You have to worry about breaking others that are dependent on you, and so while it can be done, it's tough to have the right design emerge without getting a quick turnaround from other teams. Of course deprecation is going to help here, but I think YAGNI is a lot better design heuristic when it comes to APIs.
I think deprecation of code is an inevitable byproduct of Agile processes like continuous refactoring and incremental development. So if you end up with deprecated code as you work on your project, that's not necessarily a bad thing--just a fact of life. Of course, you will probably find that, rather than deprecating code, you end up keeping a lot of code but refactoring it into different methods, classes, and so on.
So, bottom line: I wouldn't worry about deprecating code during Agile development. If it served its purpose for a while, you're doing the right thing.
The rule of thumb for API design is to focus on what it does, rather than how it does it. Once you know the end goal, figure out the absolute minimum input you need and use that. Avoid passing your own objects as parameters, pass only data.
Seperate configuration from execution. For exmaple, maybe you have an image encoder/decoder.
Instead of making a call like:
Encoder.Encode( bytes, width, height, compression_type, compression_ratio, palette, etc etc);
Make it
Encoder.setCompressionType(compression_type);
Encoder.setCompressionType(compression_ratio);
etc,etc
Encoder.Encode(bytes, width, height);
That way adding or removing settings is much less likely to break existing implementations.
For deprecation, there's basically 3 types of APIs: internal, external, and public.
Internal is when its only your team working on the code. Deprecating these APIs isn't a big deal. Your team is the only one using it, so they aren't around long, there's pressure to change them, people aren't afraid to change them, and people know how to change them.
External is when its the same code base, but different teams are using it. This might be some common libraries in a large company, or a popular open source library. The point is, people can choose the version of code they compile with. The ease of deprecating an API depends on the size of the organization and how well they communicate. IMO, its the deprecator's job to update old code, rather than mark it deprecated and let warnings fly throughout the code base. Why the deprecator instead of the deprecatee? Because the depcarator is in the know; they know what changed and why.
Those two cases are pretty easy. So long as there is backwards compatibility, you can generally do whatever you'd like, update the clients yourself, or convince the maintainers to do it.
Then there are public api's. These are basically external API's that the clients don't have much control over, such as a web API. These are incredibly hard to update or deprecate. Most won't notice its broken, won't have someone to fix it, won't get notifications that its changing, and will only fix it once its broken (after they've yelled at you for breaking it, over course).
I've had to do the above a few times, and it is such a chore. I think the best you can do is purposefully break it early, wait a bit, and then restore it. You send out the usual warnings and deprecations first, of course, but - trust me - nothing will happen until something breaks.
An idea I've yet to try is to let people register simple apps that run small tests. When you want to do an API update, you run the external tests and contact the affected people.
Another approach to be popular is to have clients depend on (web) services. There are constructs out there that allow you to version your services and allow clients to perform lookups. This adds a lot more moving parts and complexity into the equation, but can be helpful if you are looking at turning over a lot of versions, and having to support multiple versions in production.
This article does a good job of explaining the problem and an approach.
Related
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
I saw this ReasonML vs TypeScript question here at StackOverflow, now I'm wondering how ReasonML and Elm compare to each other.
What are their similarities and differences? Which one should I use when?
What's the advantage of one over the other?
I'm not intimately familiar with Elm, but I've looked into it a bit and I'm pretty familiar with Reason, so I'll give it a shot. I'm sure there will be inaccuracies here though, so please don't take anything I say as fact, but use it instead as pointers for what to look into in more detail yourself if it matters to you.
Both Elm and Reason are ML-like languages with very similar programming models, so I'll focus on the differences.
Syntax:
Elm uses a Haskell-like syntax that is designed (and/or evolved) for the programming model both Elm and Reason uses, so should work very well for reading and writing idiomatic code once you`re familiar with it, but will seem very different and unfamiliar to most programmers.
Reason tries to be more approachable by emulating JavaScript's syntax as much as possible, which will be familiar to most programmers. However it also aims to support the entire feature set of the underlying OCaml language, which makes some functional patterns quite awkward.
One example of this is the function application syntax, which in Elm emphasizes the curried nature of functions (f a b) and works very well for composing functions and building readable DSLs. Reason's parenthesized syntax (f(a, b)) hides this complexity, which makes it easier to get into (until you accidentally trip on it, since it's of course still different underneath), but makes heavy use of function composition a mess of parentheses.
Mutability:
Elm is a purely functional language, which is great in theory but challenging in practice since the surrounding world cares little about Elm's quest for purity. Elm's preferred solution to this, I think, is to isolate the impurity by writing the offending code in JavaScript instead, and to then access it in Elm through either web components or ports. This means you might have to maintain significant amounts of code in a separate and very unsafe language, quite a bit of boilerplate to connect them, as well as having to figure out how to fit the round things though the square holes of ports and such in the first place.
Reason on the other hand is... pragmatic, as I like to call it. You sacrifice some safety, ideals and long-term benefits for increased productivity and short-term benefits. Isolating impurity is still good practice in Reason, but you're inevitably going to take short-cuts just to get things done, and that is going to bite you later.
But even if you do manage to be disciplined enough to isolate all impurity, you still have to pay a price to have mutation in the language. Part of that price is what's called the value restriction, which you're going to run into sooner or later, and it's going to both confuse and infuriate you, since it will reject code that intuitively should work, just because the compiler is unable to prove that there can`t at some point be a mutable reference involved.
JavaScript interoperability:
As mentioned above, Elm provides the ability to interoperate with JavaScript through ports and web components, which are deliberately quite limited. You used to be able to use native modules, which offered much more flexibility (and ability to shoot yourself in the foot), but that possibility is going away (for the plebs at least), a move that has not been uncontroversial (but also shouldn't be all that surprising given the philosophy). Read more about this change here
Reason, or rather BuckleScript, provides a rich set of primitives to be able bind directly to JavaScript, and very often produce an idiomatic Reason interface without needing to write any glue code. And while not very intuitive, it's pretty easy to do once you grok it. It's also easy to get it wrong and have it blow up in your face at some random point later, however. Whatever glue code you do have to write to provide a nice idiomatic API can be written in Reason, with all its safety guarantees, instead of having to write unsafe JavaScript.
Ecosystem:
As a consequence of Elm's limited JavaScript interoperability, the ecosystem is rather small. There aren't a whole lot of good quality third party JavaScript libraries that provide web components, and doing it yourself takes a lot of effort. So you'll instead see libraries being implemented directly in Elm itself, which takes even more effort, of course, but will often result in higher quality since they're specifically designed for Elm.
Tooling:
Elm is famous for its great error messages. Reason to a large degree does not, though it strives to. This is at least partly because Reason is not itself a compiler but instead built on top of the OCaml compiler, so the information available is limited, and the surface area of possible errors very large. But they're also not as well thought through.
Elm also has a great packaging tool which sets everything up for you and even checks whether the interface of a package you're publishing has changed and that the version bump corresponds to semantic versioning. Resaon/BuckleScript just uses npm and requires you to manage everything Reason/BuckleScript-specific manually like updating bsconfig.json with new dependencies.
Reason, BuckleScript, its build system, and OCaml are all blazing fast though. I've yet to experience any project taking more than 3 seconds to compile from scratch, including all dependencies, and incremental compilation usually takes only milliseconds (though this isn't entirely without cost to user-friendliness). Elm, as I understand it, is not quite as performant.
Elm and Reason both have formatting tools, but Reason-formatted code is of significantly poorer quality (though slowly improving). I think this is largely because of the vastly more complex syntax it has to deal with.
Maturity and decay:
Reason, being built on OCaml, has roots going back more than 20 years. That means it has a solid foundation that's been battle-tested and proven to work over a long period of time. Furthermore, it's a language largely developed by academics, which means a feature might take a while to get implemented, but when it does get in it's rock solid because it's grounded in theory and possibly even formally proven. On the downside, its age and experimental nature also means it's gathered a bit of cruft that`s difficult to get rid of.
Elm on the other hand, being relatively new and less bureaucratically managed, can move faster and is not afraid of breaking with the past. That makes for a slimmer and more coherent, but also has a less powerful type system.
Portability:
Elm compiles to JavaScript, which in itself is quite portable, but is currently restricted to the browser, and even more so to the Elm Architecture. This is a choice, and it wouldn't be too difficult to target node or platforms. But the argument against it is, as I understand it, that it would divert focus, thereby making it less excellent at its niche
Reason, being based on OCaml, actually targets native machine code and bytecode first and foremost, but also has a JavaScript compiler (or two) that enables it to target browsers, node, electron, react native, and even the ability to compile into a unikernel. Windows support is supposedly a bit sketchy though. As an ecosystem, Reason targets React first and foremost, but also has libraries allowing the Elm Architecture to be used quite naturally
Governance:
Elm is designed and developed by a single person who's able to clearly communicate his goals and reasoning and who's being paid to work on it full-time. This makes for a coherent and well-designed end product, but development is slow, and the bus factor might make investment difficult.
Reason's story is a bit more complex, as it's more of an umbrella name for a collection of projects.
OCaml is managed, designed and developed in the open, largely by academics but also by developers sponsored by various foundations and commercial backers.
BuckleScript, a JavaScript compiler that derives from the OCaml compiler, is developed by a single developer whose goals and employment situation is unclear, and who does not bother to explain his reasoning or decisions. Development is technically more open in that PRs are accepted, but the lack of explanation and obtuse codebase makes it effectively closed development. Unfortunately this does not lead to a particularly coherent design either, and the bus factor might make investment difficult here as well.
Reason itself, and ReasonReact, is managed by Facebook. PRs are welcomed, and a significant amount of Reason development is driven by outsiders, but most decisions seem to be made in a back room somewhere. PRs to ReasonReact, beyond trivial typo fixes and such, are often rejected, probably for good reason but usually with little explanation. A better design will then typically emerge from the back room sometime later.
Greetings!
I inherited a C#.NET application I have been extending and improving for a while now. Overall it was obviously a rush-job (or whoever wrote it was seemingly less competent than myself). The app pulls some data from an embedded device & displays and manipulates it. At the core is a communications thread in the main application form which executes a 600+ lines of code method which calls functions all over the place, implementing a state machine - lots of if-state-then-do type code. Interaction with the device is done by setting the state/mode globally and letting the thread do it's thing. (This is just one example of the badness of the code - overall it is not very OO-like, it reminds of the style of embedded C code the device firmware is written in).
My problem is that this piece of code is central to the application. The software, communications protocol or device firmware are not documented at all. Obviously to carry on with my work I have to interact with this code.
What I would like some guidance on, is whether it is worth scrapping this code & trying to piece together something more reasonable from the information I can reverse engineer? I can't decide! The reason I don't want to refactor is because the code already works, and changing it will surely be a long, laborious and unpleasant task. On the flip side, not refactoring means I have to sometimes compromise the design of other modules so that I may call my code from this state machine!
I've heard of "If it ain't broke don't fix it!", so I am wondering if it should apply when "it" is influencing the design of future code! Any advice would be appreciated!
Thanks!
Also, the longer you wait, the worse the codebase will smell. My suggestion would be first create a testsuite that you can evaluate your refactoring against. This makes it a lot easier to see if you are refactoring or just plain breaking things :).
I would definitely recommend you to refactor the code if you feel its junky. Yes, during the process of refactoring you may have some inconsistencies/problems at the start. But that is why we have iterations and testing. Since you are going to build up on this core engine in future, why not make the basement as stable as possible.
However, be very sure on what you are going to do. Because at times long lines of code does not necessarily mean evil. On the other hand they may be very efficient in running time. If/else blocks are not bad if you ask me, as they are very intelligent in branching from a microprocessor's perspective. So, you will have to be judgmental and very clear before you touch this.
But once you refactor the code, you will definitely have fine control over it. And don't forget to document it!! Tomorrow, someone might very well come and say about you on whatever you've told about this guy who have written that core code.
This depends on the constraints you are facing, it's a decision to be based on practical basis, not on theoretical ones. You need three things to consider.
Time: you need to have enough time to learn it, implement it, and test it, without too many other tasks interrupting you
Boss #1: if you are working for someone, he needs to know and approve the time and effort you will spend immediately, required to rebuild your solution
Boss #2: your boss also needs to know that the advantage of having new and clean software will come at the price of possible regressions, and therefore at the beginning of the deployment there may be unexpected bugs
If you have those three, then go ahead and refactor it. It will be surely be worth it!
First and foremost, get all the business logic out of the Form. Second, locate all the parts where the code interacts with the global state (e.g. accessing the embedded system). Delegate all this access to methods. Then, move these methods into a new class and create an instance in the class's constructor. Finally, inject an instance for the class to use.
Following these steps, you can move your embedded system logic ("existing module") to a wrapper class you write, so the interface can be nice and clean and more manageable. Then you can better tackle refactoring the monster method because there is less global state to worry about (only local state).
If the code works and you can integrate your part with minimal changes to it then let the code as it is and do your integration.
If the code is simply a big barrier in your way to add new functionality then it is best for you to refactor it.
Talk with other people that are responsible for the project, explain the situation, give an estimation explaining the benefits gained after refactoring the code and I'm sure (I hope) that the best choice will be made. It is best to speak about what you think, don't keep anything inside, especially if this affects your productivity, motivation etc.
NOTE: Usually rewriting code is out of the question but depending on situation and amount of code needed to be rewritten the decision may vary.
You say that this is having an impact on the future design of the system. In this case I would say it is broken and does need fixing.
But you do have to take into account the business requirements. Often reality gets in the way!
Would it be possible to wrap this code up in another class whose interface better suits how you want to take the system forward? (See adapter pattern)
This would allow you to move forward with your requirements without the poor design having an impact.
It gives you an interface that you understand which you could write some unit tests for. These tests can be based on what your design requires from this code. It ensures that your assumptions about what it is doing is correct. If you say that this code works, then any failing tests may be that your assumptions are incorrect.
Once you have these tests you can safely refactor - one step at a time, and when you have some spare time or when it is needed - as per business requirements.
Quite often I find the best way to truly understand a piece of code is to refactor it.
EDIT
On reflection, as this is one big method with multiple calls to the outside world, you are going to need some kind of inverse Adapter class to wrap this method. If you can inject dependencies into the method (see Dependency Inversion such that the method calls methods in your classes then you can route these to the original calls.
This is more general question then language-specific, altho I bumped into this problem while playing with python ncurses module. I needed to display locale characters and have them recognized as characters, so I just quickly monkey-patched few functions / methods from curses module.
This was what I call a fast and ugly solution, even if it works. And the changes were relativly small, so I can hope I haven't messed up anything. My plan was to find another solution, but seeing it works and works well, you know how it is, I went forward to other problems I had to deal with, and I'm sure if there's no bug in this I won't ever make it better.
The more general question appeared to me though - obviously some languages allow us to monkey-patch large chunks of code inside classes. If this is the code I only use for myself, or the change is small, it's ok. What if some other developer takes my code though, he sees that I use some well-known module, so he can assume it works as it's used to. Then, this method suddenly behaves diffrent then it should.
So, very subjective, should we use monkey patching, and if yes, when and how? How should we document it?
edit: for #guerda:
Monkey-patching is the ability to dynamicly change the behavior of some piece of code at the execution time, without altering the code itself.
A small example in Python:
import os
def ld(name):
print("The directory won't be listed here, it's a feature!")
os.listdir = ld
# now what happens if we call os.listdir("/home/")?
os.listdir("/home/")
Don't!
Especially with free software, you have all the possibilities out there to get your changes into the main distribution. But if you have a weakly documented hack in your local copy you'll never be able to ship the product and upgrading to the next version of curses (security updates anyone) will be very high cost.
See this answer for a glimpse into what is possible on foreign code bases. The linked screencast is really worth a watch. Suddenly a dirty hack turns into a valuable contribution.
If you really cannot get the patch upstream for whatever reason, at least create a local (git) repo to track upstream and have your changes in a separate branch.
Recently I've come across a point where I have to accept monkey-patching as last resort: Puppet is a "run-everywhere" piece of ruby code. Since the agent has to run on - potentially certified - systems, it cannot require a specific ruby version. Some of those have bugs that can be worked around by monkey-patching select methods in the runtime. These patches are version-specific, contained, and the target is frozen. I see no other alternative there.
I would say don't.
Each monkey patch should be an exception and marked (for example with a //HACK comment) as such so they are easy to track back.
As we all know, it is all to easy to leave the ugly code in place because it works, so why spend any more time on it. So the ugly code will be there for a long time.
I agree with David in that monkey patching production code is usually not a good idea.
However, I believe that for languages that support it, monkey patching is a very valuable tool for unit testing. It allows you to isolate the piece of code you need to test even when it has complex dependencies - for instance with system calls that cannot be Dependency Injected.
I think the question can't be addressed with a single definitive yes-no/good-bad answer - the differences between languages and their implementations have to be considered.
In Python, one needs to consider whether a class can be monkey-patched at all (see this SO question for discussion), which relates to Python's slightly less-OO implementation. So I'd be cautious and inclined to expend some effort looking for alternatives before monkey-patching.
In Ruby, OTOH, which was built to be OO down into the interpreter, classes can be modified irrespective of whether they're implemented in C or Ruby. Even Object (pretty much the base class of everything) is open to modification. So monkey-patching is rather more enthusiastically adopted as a technique in that community.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
One thing I've always found frustrating is when a library I use is no longer maintained. Even looking at update history and community beforehand, I've run into the situation where I check back later to find that the version I'm using is the last version.
Generally this goes unnoticed until a few months have passed, or some bug/limitation has been found. I run into this fairly often when coding in Python, because my desire to upgrade to a new version of the interpreter can easily introduce problems in libraries that worked fine before. My question is: what is the best response to this situation?
Do you become the maintainer of the old library? Even if you're only fixing the bugs you care about, this is still a lot of work. Especially if the library is large, complex, and has less-than-well-documented code (the case more often than not).
Do you switch to a different library (if there is one)? This is also a significant undertaking, with the potential to introduce new bugs, especially if the only alternatives approach the problem from a different angle. This can be true even if you had the foresight to write an abstraction layer for the old library's functionality.
Do you roll your own? It probably ends up as less code than the old library, since you only write the parts you care about. It's therefore easier to maintain in the future. But now you've wasted days/weeks/months to produce something that is probably less functional, and is guaranteed to introduce tons of new bugs.
I realize the answer depends on the specific case: the size of the library, whether source is available, how maintainable it is, how much of it your code uses, how deeply your code relies on it, etc. I'm looking for answers across a range of cases. What are your experiences with this problem?
Well, you've found one argument to lessen the number of external dependencies...
I've come across this in several Java projects I've audited; it seems people have a tendency to drop in a Jar found somewhere on the Web for the tiniest amount of reuse possible from it. The result is a mess of dependencies that ends up undermining the code base. I prefer to use external components sparingly.
It's probably most useful to ask what you can do before. Make a point of evaluating the future lifetime of an external component before you start using it. Do some research on how large its developer community and its user community are. Also, prefer to use a component that has one or two "lesser" alternatives which you could also use.
If there's something you're tempted to use, but it has only one or two people working on it and isn't used much beyond their own project, then you should probably roll your own - or join forces with the maintainers of the component.
I think your really answer is in how do you select third party libraries to include in your code.
If you happen to like constantly upgrading your code to the latest version of the language then by default you can only use libraries that have active communities behind them
In fact I would go as far as saying that the only time that you want to use a third party open source library is when the community behind it is large (say at least 40+ users) and it has undergone a few releases.
For a commercial library the same thing applies how long is the company going to be around and how many other clients use it.
If you can't find a library in this position then ensure that you abstract the third party library out of your code so replacement isn't hard in the future.
When the Java EE framework my employer chose went belly up, we went out and found a newer, better one. Fortunately Spring was available.
We prefer to roll our own for that very reason. We end up with full control over it, full knowledge of how it works, and we can change it any way we want. When our ass is on the line when the blame game is played, we prefer to reduce the risk and do it ourselves.
We had a situation once where we did use an external library, and it got rewritten and repurposed by the author and no longer did what we expected. We rolled over that, wrote our own version, and continued safely.
The bottom line is safety, and minimization of risk.
If the source is available, the licence is open and the library does the job really well, you have the option to fork the library. By doing this, you can also add new features to it. If the library has lots of things to fix and the code is a mess, it is better to find something else to work with.
Everyone I work with is obsessed with the data-centric approach to enterprise development and hates the idea of using custom collections/objects. What is the best way to convince them otherwise?
Do it by example and tread lightly. Anything stronger will just alienate you from the rest of the team.
Remember to consider the possibility that they're onto something you've missed. Being part of a team means taking turns learning & teaching.
No single person has all the answers.
If you are working on legacy code (e.g., apps ported from .NET 1.x to 2.0 or 3.5) then it would be a bad idea to depart from datasets. Why change something that already works?
If you are, however, creating a new apps, there a few things that you can cite:
Appeal to experiencing pain in maintaining apps that stick with DataSets
Cite performance benefits for your new approach
Bait them with a good middle-ground. Move to .NET 3.5, and promote LINQ to SQL, for instance: while still sticking to data-driven architecture, is a huge, huge departure to string-indexed data sets, and enforces... voila! Custom collections -- in a manner that is hidden from them.
What is important is that whatever approach you use you remain consistent, and you are completely honest with the pros and cons of your approaches.
If all else fails (e.g., you have a development team that utterly refuses to budge from old practices and is skeptical of learning new things), this is a very, very clear sign that you've outgrown your team it's time to leave your company!
Remember to consider the possibility that they're onto something you've missed. Being part of a team means taking turns learning & teaching.
Seconded. The whole idea that "enterprise development" is somehow distinct from (and usually the implication is 'more important than') normal development really irks me.
If there really is a benefit for using some technology, then you'll need to come up with a considered list of all the pros and cons that would occur if you switched.
Present this list to your co workers along with explanations and examples for each one.
You have to be realistic when creating this list. You can't just say "Saves us lots of time!!! WIN!!" without addressing the fact that sometimes it is going to take MORE time, will require X months to come up to speed on the new tech, etc. You have to show concrete examples where it will save time, and exactly how.
Likewise you can't just skirt over the cons as if they don't matter, your co-workers will call you on it.
If you don't do these things, or come across as just pushing what you personally like, nobody is going to take you seriously, and you'll just get a reputation for being the guy who's full of enthusiasm and energy but has no idea about anything.
BTW. Look out for this particular con. It will trump everything, unless you have a lot of strong cases for all your other stuff:
Requires 12+ months work porting our existing code. You lose.
Of course, "it depends" on the situation. Sometimes DataSets or DataTables are more suited, like if it really is pretty light business logic, flat hierarchy of entities/records, or featuring some versioning capabilities.
Custom object collections shine when you want to implement a deep hierarchy/graph of objects that cannot be efficiently represented in flat 2D tables. What you can demonstrate is a large graph of objects and getting certain events to propagate down the correct branches without invoking inappropriate objects in other branches. That way it is not necessary to loop or Select through each and every DataTable just to get the child records.
For example, in a project I got involved in two and half years ago, there was a UI module that is supposed to display questions and answer controls in a single WinForms DataGrid (to be more specific, it was Infragistics' UltraGrid). Some more tricky requirements
The answer control for a question can be anything - text box, check box options, radio button options, drop-down lists, or even to pop up a custom dialog box that may pull more data from a web service.
Depending on what the user answered, it can trigger more sub-questions to appear directly under the parent question. If a different answer is given later, it should expose another set of sub-questions (if any) related to that answer.
The original implementation was written entirely in DataSets, DataTables, and arrays. The amount of looping through the hundreds of rows for multiple tables was purely mind-bending. It did not help the programmer came from a C++ background attempting to ref everything (hello, objects living in the heap use reference variables, like pointers!). Nobody, not even the originally programmer, could explain why the code is doing what it does. I came into the scene more than six months after this, and it was stil flooded with bugs. No wonder the 2nd-generation developer I took over from decided to quit.
Two months of tying to fix the chaotic mess, I took it upon myself to redesign the entire module into an object-oriented graph to solve this problem. yeap, complete with abstract classes (to render different answer control on a grid cell depending on question type), delegates and eventing. The end result was a 2D dataGrid binded to a deep hierarchy of questions, naturally sorted according to the parent-child arrangement. When a parent question's answer changed, it would raise an event to the children questions and they would automatically show/hide their rows in the grid according to the parent's answer. Only question objects down that path were affected. The UI responsiveness of this solution compared to the old method was by orders of magnitude.
Ironically, I wanted to post a question that was the exact opposite of this. Most of the programmers I've worked with have gone with the custom data objects/collections approach. It breaks my heart to watch someone with their SQL Server table definition open on one monitor, slowly typing up a matching row-wrapper class in Visual Studio in another monitor (complete with private properties and getters-setters for each column). It's especially painful if they're also prone to creating 60-column tables. I know there are ORM systems that can build these classes automagically, but I've seen the manual approach used much more frequently.
Engineering choices always involve trade-offs between the pros and cons of the available options. The DataSet-centric approach has its advantages (db-table-like in-memory representation of actual db data, classes written by people who know what they're doing, familiar to large pool of developers etc.), as do custom data objects (compile-type checking, users don't need to learn SQL etc.). If everyone else at your company is going the DataSet route, it's at least technically possible that DataSets are the best choice for what they're doing.
Datasets/tables aren't so bad are they?
Best advise I can give is to use it as much as you can in your own code, and hopefully through peer reviews and bugfixes, the other developers will see how code becomes more readable. (make sure to push the point when these occurrences happen).
Ultimately if the code works, then the rest is semantics is my view.
I guess you can trying selling the idea of O/R mapping and mapper tools. The benefit of treating rows as objects is pretty powerful.
I think you should focus on the performance. If you can create an application that shows the performance difference when using DataSets vs Custom Entities. Also, try to show them Domain Driven Design principles and how it fits with entity frameworks.
Don't make it a religion or faith discussion. Those are hard to win (and is not what you want anyway)
Don't frame it the way you just did in your question. The issue is not getting anyone to agree that this way or that way is the general way they should work. You should talk about how each one needs to think in order to make the right choice at any given time. give an example for when to use dataSet, and when not to.
I had developers using dataTables to store data they fetched from the database and then have business logic code using that dataTable... And I showed them how I reduced the time to load a page from taking 7 seconds of 100% CPU (on the web server) to not being able to see the CPU line move at all.. by changing the memory object from dataTable to Hash table.
So take an example or case that you thing is better implemented differently, and win that battle. Don't fight the a high level war...
If Interoperability is/will be a concern down the line, DataSet is definitely not the right direction to go in. You CAN expose DataSets/DataTables over a service but whether you SHOULD or is debatable. If you are talking .NET->.NET you're probably Ok, otherwise you are going to have a very unhappy client developer from the other side of the fence consuming your service
You can't convince them otherwise. Pick a smaller challenge or move to a different organization. If your manager respects you see if you can do a project in the domain-driven style as a sort of technology trial.
If you can profile, just Do it and profile. Datasets are heavier then a simple Collection<T>
DataReaders are faster then using Adapters...
Changing behavior in an objects is much easier than massaging a dataset
Anyway: Just Do It, ask for forgiveness not permission.
Most programmers don't like to stray out of their comfort zones (note that the intersection of the 'most programmers' set and the 'Stack Overflow' set is the probably the empty set). "If it worked before (or even just worked) then keep on doing it". The project I'm currently on required a lot of argument to get the older programmers to use XML/schemas/data sets instead of just CSV files (the previous version of the software used CSV's). It's not perfect, the schemas aren't robust enough at validating the data. But it's a step in the right direction. The code I develop uses OO abstractions on the data sets rather than passing data set objects around. Generally, it's best to teach by example, one small step at a time.
There is already some very good advice here but you'll still have a job to convince your colleagues if all you have to back you up is a few supportive comments on stackoverflow.
And, if they are as sceptical as they sound, you are going to need more ammo.
First, get a copy of Martin Fowler's "Patterns of Enterprise Architecture" which contains a detailed analysis of a variety of data access techniques.
Read it.
Then force them all to read it.
Job done.
data-centric means less code-complexity.
custom objects means potentially hundreds of additional objects to organize, maintain, and generally live with. It's also going to be a bit faster.
I think it's really a code-complexity vs performance question, which can be answered by the needs of your app.
Start small. Is there a utility app you can use to illustrate your point?
For instance, at a place where I worked, the main application had a complicated build process, involving changing config files, installing a service, etc.
So I wrote an app to automate the build process. It had a rudimentary WinForms UI. But since we were moving towards WPF, I changed it to a WPF UI, while keeping the WinForms UI as well, thanks to Model-View-Presenter. For those who weren't familiar with Model-View-Presenter, it was an easily-comprehensible example they could refer to.
Similarly, find something small where you can show them what a non-DataSet app would look like without having to make a major development investment.