I want to compare the performance of some web frameworks (Ruby on Rails and ASP.NET MVC 3), but I don't know how to get started. Should I measure how fast each framework runs a 10k-iteration loop, or how fast it renders 10,000 lines of HTML? Are there programs that can help with this? Also, how can the server load be monitored? Any help is appreciated!
Thijs
With respect, this is an unanswerable question. Is a Porsche faster than a Prius? Well, no, not when the Porsche is in the shop :-).
The answer depends on what you're trying to accomplish, how you do it, and how you code it. For example, Rails goes out of its way to transparently cache as much as it can, and then makes it trivially easy to cache stuff on your command. Of course there's a way to do the same in ASP MVC3, but is it as easy?
Can you find, hire, and train a suitable team that knows how to use the framework? What's the culture of the organization (Windows or Unix)? I could write a really fast application in MS Access and the same application poorly in Rails against a high-performance database, and the MS Access app would win. It's far from a given that an application will be written well, optimized, or whatever.
These days, a well-written application is typically performance bound on data I/O, and if this is the case, then it's which database you use that might matter. The loop-test you propose would test almost nothing, unless you're writing an application that calculates pi to the billionth place, or something.
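If you do want a rough number to get started, timing a burst of real HTTP requests against a page that actually hits the database says far more than a synthetic loop. Here's a minimal C# sketch of that idea; the URL, port, and request count are placeholders, and you'd point the same code at each framework's app in turn (dedicated tools like Apache Bench or New Relic will of course give you much richer data).

```csharp
// A rough end-to-end check (not a micro-benchmark): time a burst of real HTTP
// requests against a page that actually touches the database. URL and request
// count are placeholders; run the same thing against each framework's app.
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

class QuickLoadCheck
{
    static async Task Main()
    {
        const string url = "http://localhost:3000/products"; // hypothetical page
        const int requests = 200;

        using var client = new HttpClient();
        await client.GetAsync(url);                 // warm-up (caches, JIT, connections)

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < requests; i++)
        {
            var response = await client.GetAsync(url);
            response.EnsureSuccessStatusCode();
        }
        sw.Stop();

        Console.WriteLine($"{requests} requests in {sw.ElapsedMilliseconds} ms " +
                          $"({requests * 1000.0 / sw.ElapsedMilliseconds:F1} req/s)");
    }
}
```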
I am sure there are published benchmarks of application frameworks available, but again, they need to make assumptions about what the application actually has to do.
The reality is that any reasonable framework (which includes both of the two you mention) is likely to be as fast as necessary for most scenarios, and again, what you do, and how you architect and implement it are the far more likely culprits for performance problems.
Once you do choose, there's a great (awesome) tool called NewRelic RPM which works with several frameworks -- I use it with Rails, and it gives you internal metrics at a level of detail that is beyond belief.
I don't mean to be glib, or unhelpful. But this is a little bit of a sore spot for me -- in so many cases people say "we should use foo instead of bar because foo's faster", and weeks go by as bar is replaced by foo. And then there are little incompatibilities. And an unexpected bug. And then, well, for some reason the new one is a little slower. And then after it gets optimized, it's finally just as fast.
I'll step down from my soapbox now :-)
I have worked a lot on improving how my code runs (and its "beauty"), but when is it time to stop fixing and start working on the UI?
Microsoft (in my opinion) seems to go with nice code, while Apple goes with a nice UI (although Apple's developer examples do have very nice code).
I'm bad at balancing; when is it time to work on one or the other?
Make it work, then make it elegant, then make it fast.
If the user doesn't like the program, the quality of the code is irrelevant. Once the user likes the program, then code quality becomes more important. And by "like" I mean a good user experience where the program doesn't crash, fulfills the need that the user has, and adheres to the Principle of Least Surprise.
Note I say "more important" because code is not something the user sees or is interested in. Code quality and "beauty" is important to developers because it's what they see of the program and what they hand over to other developers.
I remember reading something some time ago comparing (in general terms) software developed for the windows platform and software developed for OS X. In general terms it said that windows programs tended to be developed by developers who didn't spend too much time on the UI or thinking about the user experience. Their concentration was getting every piece of functionality they could think of into the program without any thought about it making sense. Mac OS X on the other hand tended to be developed by people concentrating on the user experience first and solving their problems. So it didn't necessarily have as much functionality, but what it did have was directly associated with what the user needed and was easy to use.
So when is it time to stop and think about the UI? The fact that you have asked the question makes me think it's time to stop right now. If anything, I'd suggest that even before writing any code you should have drawn out a basic UI and worked out what sort of user experience your program is going to provide. If you cannot work that out, you don't want to waste time writing code, because it will never be used.
That's not to say I think code "beauty" is irrelevant. I spend time on making sure my code is well written, easy to follow, and looks "good". But that's after I've figured out the UI, and because I've had a lot of experience cleaning up other people's awful code :-)
This is an incredibly broad question, and the answer is entirely a function of what you're building, why, and for whom.
Some random bits of conventional wisdom that I agree with:
If you're building broadly-applicable consumer software (e.g. for Mac or iOS), then the UI is a critical component of the application. Spend lots of time on it. :) Work with designers. If your software looks like crap, no one will want to use it. (Assuming nobody's going to make them: see next point.)
If you're building internal or enterprise tools, UI polish is probably less important.
Even if you're a one-man (or woman) operation, think of engineering as distinct from "product management". Product management is the thought process that determines what the software should do, and how it should work (and look). Engineering creates the reality, but has different tradeoffs. These are sometimes in conflict, which is hard to handle if you're one person, but try to wear different hats.
In all cases, your code should work, for some reasonable definition of work.
If it's ugly because it's hacked together, you're likely to run into accuracy/correctness problems sooner than if you construct it methodically. It will also be harder to maintain. The tradeoff here is almost never worth it. As you gain experience, you'll more naturally write clean code from the start, even if it's "scratch" code. You'll save time in the long run this way, and not be faced with later wholesale attempts to improve its beauty.
If you're on a team, or working with others, clean code is even more important. Find out if there are any "local" coding conventions, and work closely with colleagues.
As far as improving "how code runs", if you mean performance optimization, don't do any of that until you're sure you need to. Write the simpler code first, even if it's slower. It's likely that it won't be slow enough to matter.
Special case: if you're writing a game, long-term maintainability is less important, because they tend to be "throwaway" at some level. YMMV.
Is there a method for tracking or measuring the cause of bugs which won't result in unintended consequences from development team members? We recently added the ability to assign the cause of a bug in our tracking system. Examples of causes include: bad code, missed code, incomplete requirements, missing requirements, incomplete testing etc. I was not a proponent of this as I could see it leading to unintended behaviors from the dev team. To date this field has been hidden from team members and not actively used.
Now we are in the middle of a project where we have a larger than normal number of bugs and this type of information would be good to have in order to better understand where we went wrong and where we can make improvements in the future (or adjustments now). In order to get good data on the cause of the bugs we would need to open this field up for input by dev and qa team members and I'm worried that will drive bad behaviors. For example people may not want to fix a defect they didn't create because they'll feel it reflects poorly on their performance, or people might waste time arguing over the classification of a defect for similar reasons.
Has anyone found a mechanism to do this type of tracking without driving bad behaviors? Is it possible to expect useful data from team members if we explain to the team the reasoning behind the data (not to drive individual performance metrics, but project success metrics)? Is there another, better way to do this type of thing (a more ad-hoc post mortem or open discussion on the issues perhaps)?
A lot of version control packages have things like svn blame. This is not a direct metric for tracking a bug, but it can tell you who checked in changes to a release that has a major bug in it.
There are also programs like http://www.bugzilla.org/ that help track things over time.
But as far as really digging into why bugs exist, yes, it's definitely worth looking into, though I can't give a standard metric for collecting that information. There are a number of reasons why a system might be very buggy:
Poorly written specs
Rushed timelines
Low-skill programming
Bad morale
Lack of beta or QA testing
Lack of preparing software so that it is even feasible to beta or QA test
Poor ratio of time spent cleaning up bugs vs getting new functionality out
Poor ratio of time spent making bug-free enhancements vs getting functionality out
An exceedingly complex system that is easy to break
A changing environment that is outside the code base, such as the machine administration
Blame for mistakes affecting programmer compensation or promotion
That's just to name a few... If too many bugs is a big problem, then management, lead programmers, and any other stakeholders in the whole process need to sit down and discuss the issue.
High bug rates can be a symptom of a schedule which is too rushed or inflexible. Switching to a zero defect approach may help. Fix all bugs before working on new code.
Assigning reasons is a good technique to see if you have a problem area. Typical metrics I have seen and encountered are an even split between:
Specification errors (missing, incorrect, etc.)
Application bugs (incorrect code, missing code, bad data, etc.)
Incorrect tests / no error (generally incorrect expectations, or specifications not yet implemented)
Reviewing and verifying the defect causes can be useful.
On occasion I've heard people discuss the benefits of keeping track of programming mistakes, if for no other reason than it increases awareness of common errors. I've started to keep a list of bugs that I find in my code, along with what could have led to them. The main question I have is this:
What information related to my mistakes should I be keeping track of so that I can improve as a programmer?
And a couple other questions related to this:
How do I use this information once I start logging my mistakes?
Is tracking mistakes truly beneficial?
This is only useful if you are actually vigilant with tracking and reviewing. When I was working on a team, no matter how thoroughly we documented that, for example, our servers in the production environment were NATed and would not be able to resolve their own domain names or public IP addresses, every six months I'd get a call at 4 AM from the deployment team or the dev team because a new developer either forgot or was unaware.
I remember a particular engineer who was responsible for deploying; he had paper checklists, and we built him deployment tools that forced him to record his checklist, yet he would always forget to set the connection string (resulting in the 4 AM phone call). The point is, it's only worth it if you're going to use it vigilantly.
I've found the best way to use this is by implementing your rules in a code analyzer like FxCop.
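FxCop rules have their own API, but the same idea in today's toolchain is a Roslyn analyzer. This is only a minimal sketch under that assumption: the rule ID, messages, and the "empty catch block" check itself are made up, purely as an example of promoting a recurring personal mistake into an automated check.

```csharp
// Minimal Roslyn analyzer sketch: flags empty catch blocks.
// Rule ID and wording are hypothetical.
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public class EmptyCatchAnalyzer : DiagnosticAnalyzer
{
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "MY0001",                          // hypothetical rule id
        title: "Empty catch block",
        messageFormat: "Catch block swallows the exception",
        category: "Reliability",
        defaultSeverity: DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution();
        // Inspect every catch clause in the syntax tree.
        context.RegisterSyntaxNodeAction(AnalyzeCatch, SyntaxKind.CatchClause);
    }

    private static void AnalyzeCatch(SyntaxNodeAnalysisContext context)
    {
        var catchClause = (CatchClauseSyntax)context.Node;
        if (!catchClause.Block.Statements.Any())
        {
            context.ReportDiagnostic(Diagnostic.Create(Rule, catchClause.GetLocation()));
        }
    }
}
```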
I think what is more useful than keeping a log of individual mistakes is making sure you come away with a real understanding of why it was a mistake in the first place. Most mistakes stem from a lack of understanding about one thing or another; correct that understanding and you eliminate an entire set of potential mistakes in the future. If I were to log anything, it would be what I learned from the experience that I didn't know before, not the specifics of the mistake that was made, which are likely to be of limited usefulness when you look back at them later.
what was the mistake
how can it be avoided
add the latter to an appropriate checklist, and refer to it as often as appropriate
I think tracking mistakes can be worthwhile, but in my experience it helps a lot to categorize them at some level.
Every programmer is going to make enough mistakes over the course of their career to fill an encyclopaedia. If you make a huge checklist out of all of them then you're never going to get any coding done because you'll eventually be spending all of your time going over your checklist. So: categorize your mistakes in some way that makes sense to you so you can rifle through your list looking at the most important mistakes for the sort of code you're currently working on.
Also, to add to the above as far as what to collect:
what are the symptoms of the mistake (so you can find it later)
how you actually solved it
Yes, tracking your personal mistakes is beneficial. Refer to the SEI for numerous data points (here's one at random). One such methodology is the Personal Software Process (PSP). It's too long to go into here, but here's a book about it. There's also this free SEI publication on PSP.
If you balk at SEI and think Agile is the way to go, you'll probably get better mileage out of a book like Clean Code: A Handbook of Agile Software Craftsmanship (publisher website).
Bottom line: disciplined developer = good, undisciplined developer = bad.
I would also ask how much time would be required to accurately track the mistakes, and whether that time could be better spent directly on improving the software instead. If you can do this in a minimal amount of time and are able to refer back to your records to prevent future mistakes, it may be valuable. In the long run, though, I think it will be better to stick with a very high-level list of the common mistakes you make.
I'd look for trends or similar types of errors that tend to recur. Once you're aware of the types of errors you make, you can be alert for them, or you can change your coding style to avoid them.
As an example, I worked very closely with my office-mate in a previous life. At one point I was halfway through my first sentence explaining some odd behavior, and he interrupted with, "increment your counter." I still double-check my loop counters today!
Instead of keeping a log of future mistakes, maybe you could review your bug tracker history and try to find common types of errors which keep occurring.
I'm so inspired I think I'll try this tomorrow.
"What information related to my mistakes should I be keeping track of so that I can improve as a programmer?"
Read other people's blogs. Write your own. You don't have to publish it. But you do have to turn each mistake into a story.
Don't drown this in metadata. This isn't amenable to a database or even a bug tracker. A mistake is a story where someone made a bad choice and recovered from it.
Here's your outline.
Context. What was going on.
Situation. The problem you were trying to solve.
What you did. Why it was bad.
What you should have done. Why it was better.
What changed. What you learned.
You should also record your successes in almost the same form.
Context. What was going on.
Situation. The problem you were trying to solve.
What you did. Why it was good.
What you learned.
When in doubt, watch a movie. Seriously. Characters are confronted with choices, make mistakes, and recover from those mistakes. That story line is the essence of drama. A mistake is the same thing.
Be careful what you log. Not every bad situation is a mistake. Some situations are inevitable or so far outside your control that nothing can be done about it.
Bad planning on someone else's part which puts you into panic-mode isn't your mistake. You can meticulously log everyone else's mistakes, but what's the point? If you're tracking what other people are doing, you should seek therapy.
Bad planning which provides inadequate budget, schedule or communication about changes may be a compound problem.
You didn't provide enough planning information up front.
In the scramble to cope, you made another mistake, which led to problem resolution and recovery.
In hindsight (which is always 20/20) you can see a decision which was wrong. However, at the time, it was based on best available information. That's not a mistake. Any sentence that begins "If we'd known that X..." is useless for mistake analysis. You can try to write a checklist of all the things you need to know next time you make that decision. But next time, you won't be making exactly that decision and some other piece of information will turn up missing.
Never call other people's actions mistakes.
Find the root causes. It's a root cause when it's something you can't control.
Don't retroactively call information that turned up missing a mistake.
I have been thinking the same thing. For the time being I keep a little spreadsheet with bugs I correct. The purpose of this is to make myself aware of the more common errors being made.
Hopefully I will start to see patterns and avoid making common errors over and over again.
I find that this makes my job more interesting, probably since I am an analytical person who likes to collect data and information in a structured manner.
You should try to relate the errors to the use of inadequate tools, habits, methods, or knowledge. In most cases you will find there would have been ways to catch the error earlier. Things like type safety, unit testing, code review, coding standards, better API design, better documentation, and a better working environment come to mind.
I use JIRA with a custom workflow. For a bug, the last step before an issue is closed is a screen that allows me to select a "root cause". I've got a history of the root cause of every bug that's occurred in the last 3 systems I've built.
Everyone I work with is obsessed with the data-centric approach to enterprise development and hates the idea of using custom collections/objects. What is the best way to convince them otherwise?
Do it by example and tread lightly. Anything stronger will just alienate you from the rest of the team.
Remember to consider the possibility that they're onto something you've missed. Being part of a team means taking turns learning & teaching.
No single person has all the answers.
If you are working on legacy code (e.g., apps ported from .NET 1.x to 2.0 or 3.5) then it would be a bad idea to depart from datasets. Why change something that already works?
If you are, however, creating new apps, there are a few things that you can cite:
Appeal to the pain of maintaining apps that stick with DataSets
Cite performance benefits for your new approach
Bait them with a good middle ground. Move to .NET 3.5 and promote LINQ to SQL, for instance: while still sticking to a data-driven architecture, it is a huge, huge departure from string-indexed DataSets, and it enforces... voila! Custom collections, in a manner that is hidden from them (see the sketch below).
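For what it's worth, a minimal LINQ to SQL sketch of that middle ground might look like the following; the table, columns, and connection string are hypothetical. The mapped class is, in effect, a custom object, but the mapping and materialization are handled for you.

```csharp
// Minimal LINQ to SQL sketch (table, columns, and connection string are made up).
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

[Table(Name = "Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)] public int CustomerId { get; set; }
    [Column] public string Name { get; set; }
    [Column] public string City { get; set; }
}

class Demo
{
    static void Main()
    {
        using var db = new DataContext(@"Server=.;Database=Shop;Integrated Security=true");

        // Strongly typed query instead of string-indexed DataRows.
        var londoners = from c in db.GetTable<Customer>()
                        where c.City == "London"
                        orderby c.Name
                        select c;

        foreach (var c in londoners)
            Console.WriteLine($"{c.CustomerId}: {c.Name}");
    }
}
```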
What is important is that whatever approach you use you remain consistent, and you are completely honest with the pros and cons of your approaches.
If all else fails (e.g., you have a development team that utterly refuses to budge from old practices and is skeptical of learning new things), this is a very, very clear sign that you've outgrown your team and it's time to leave your company!
Remember to consider the possibility that they're onto something you've missed. Being part of a team means taking turns learning & teaching.
Seconded. The whole idea that "enterprise development" is somehow distinct from (and usually the implication is 'more important than') normal development really irks me.
If there really is a benefit for using some technology, then you'll need to come up with a considered list of all the pros and cons that would occur if you switched.
Present this list to your co workers along with explanations and examples for each one.
You have to be realistic when creating this list. You can't just say "Saves us lots of time!!! WIN!!" without addressing the fact that sometimes it is going to take MORE time, will require X months to come up to speed on the new tech, etc. You have to show concrete examples where it will save time, and exactly how.
Likewise you can't just skirt over the cons as if they don't matter, your co-workers will call you on it.
If you don't do these things, or come across as just pushing what you personally like, nobody is going to take you seriously, and you'll just get a reputation for being the guy who's full of enthusiasm and energy but has no idea about anything.
BTW. Look out for this particular con. It will trump everything, unless you have a lot of strong cases for all your other stuff:
Requires 12+ months work porting our existing code. You lose.
Of course, "it depends" on the situation. Sometimes DataSets or DataTables are better suited, for example when the business logic really is light, the hierarchy of entities/records is flat, or you need some versioning capabilities.
Custom object collections shine when you want to implement a deep hierarchy/graph of objects that cannot be efficiently represented in flat 2D tables. What you can demonstrate is a large graph of objects and getting certain events to propagate down the correct branches without invoking inappropriate objects in other branches. That way it is not necessary to loop or Select through each and every DataTable just to get the child records.
For example, in a project I got involved in two and a half years ago, there was a UI module that was supposed to display questions and answer controls in a single WinForms DataGrid (to be more specific, it was Infragistics' UltraGrid). Some of the trickier requirements:
The answer control for a question can be anything - text box, check box options, radio button options, drop-down lists, or even to pop up a custom dialog box that may pull more data from a web service.
Depending on what the user answered, it can trigger more sub-questions to appear directly under the parent question. If a different answer is given later, it should expose another set of sub-questions (if any) related to that answer.
The original implementation was written entirely in DataSets, DataTables, and arrays. The amount of looping through hundreds of rows across multiple tables was purely mind-bending. It did not help that the programmer came from a C++ background and attempted to ref everything (hello, objects living on the heap already use reference variables, like pointers!). Nobody, not even the original programmer, could explain why the code did what it did. I came onto the scene more than six months after this, and it was still flooded with bugs. No wonder the second-generation developer I took over from decided to quit.
After two months of trying to fix the chaotic mess, I took it upon myself to redesign the entire module as an object-oriented graph to solve this problem. Yep, complete with abstract classes (to render a different answer control in a grid cell depending on the question type), delegates, and eventing. The end result was a 2D DataGrid bound to a deep hierarchy of questions, naturally sorted according to the parent-child arrangement. When a parent question's answer changed, it would raise an event to the child questions, and they would automatically show/hide their rows in the grid according to the parent's answer. Only question objects down that path were affected. The UI responsiveness of this solution improved on the old method by orders of magnitude.
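A stripped-down sketch of the idea (class and property names here are illustrative, not from the original project): each sub-question subscribes only to its own parent, so changing an answer touches exactly one branch of the graph.

```csharp
// Sketch of event propagation down a question/sub-question graph.
using System;

public class Question
{
    public string Text { get; }
    public string Answer { get; private set; }
    public bool Visible { get; private set; } = true;

    public event Action<Question> AnswerChanged;

    public Question(string text) => Text = text;

    // A sub-question subscribes only to its own parent, so an answer change
    // propagates down exactly one branch of the graph (a fuller version would
    // also cascade hiding to grandchildren).
    public void AddChild(string triggeringAnswer, Question child)
    {
        child.Visible = false;   // hidden until the triggering answer is given
        AnswerChanged += parent => child.Visible = parent.Answer == triggeringAnswer;
    }

    public void SetAnswer(string answer)
    {
        Answer = answer;
        AnswerChanged?.Invoke(this);   // only this question's children react
    }
}

// Usage: answering the parent shows or hides just its own sub-questions,
// with no scanning of unrelated rows or tables.
// var smoker = new Question("Do you smoke?");
// var perDay = new Question("How many per day?");
// smoker.AddChild("Yes", perDay);
// smoker.SetAnswer("Yes");   // perDay.Visible == true
// smoker.SetAnswer("No");    // perDay.Visible == false
```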
Ironically, I wanted to post a question that was the exact opposite of this. Most of the programmers I've worked with have gone with the custom data objects/collections approach. It breaks my heart to watch someone with their SQL Server table definition open on one monitor, slowly typing up a matching row-wrapper class in Visual Studio on another monitor (complete with private fields and getter/setter properties for each column). It's especially painful if they're also prone to creating 60-column tables. I know there are ORM systems that can build these classes automagically, but I've seen the manual approach used much more frequently.
Engineering choices always involve trade-offs between the pros and cons of the available options. The DataSet-centric approach has its advantages (a db-table-like in-memory representation of actual db data, classes written by people who know what they're doing, familiarity to a large pool of developers, etc.), as do custom data objects (compile-time checking, users don't need to learn SQL, etc.). If everyone else at your company is going the DataSet route, it's at least technically possible that DataSets are the best choice for what they're doing.
DataSets/tables aren't so bad, are they?
The best advice I can give is to use it as much as you can in your own code, and hopefully, through peer reviews and bug fixes, the other developers will see how the code becomes more readable (make sure to push the point when these occurrences happen).
Ultimately, if the code works, the rest is semantics, in my view.
I guess you can try selling the idea of O/R mapping and mapper tools. The benefit of treating rows as objects is pretty powerful.
I think you should focus on the performance. If you can, create an application that shows the performance difference between using DataSets and custom entities. Also, try to show them Domain-Driven Design principles and how they fit with entity frameworks.
Don't make it a religion or faith discussion. Those are hard to win (and that's not what you want anyway).
Don't frame it the way you just did in your question. The issue is not getting anyone to agree that this way or that way is the general way they should work. You should talk about how each person needs to think in order to make the right choice at any given time. Give an example of when to use a DataSet, and when not to.
I had developers using DataTables to store data they fetched from the database and then writing business logic code against that DataTable... And I showed them how I reduced the time to load a page from taking 7 seconds at 100% CPU (on the web server) to not being able to see the CPU line move at all, by changing the in-memory object from a DataTable to a hash table.
So take an example or case that you think is better implemented differently, and win that battle. Don't fight a high-level war...
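A rough illustration of that DataTable-to-hash-table change (table and column names are made up): DataTable.Select re-scans the rows and parses the filter string on every call, whereas building a Dictionary once turns each lookup into a single hash probe.

```csharp
// Illustrative only: replacing repeated DataTable.Select calls with a Dictionary.
using System;
using System.Collections.Generic;
using System.Data;

class CustomerLookupDemo
{
    static void Main()
    {
        var table = LoadCustomers();   // imagine this came from a DataAdapter.Fill

        // Old approach: repeated filtered scans inside business logic.
        DataRow[] slow = table.Select("CustomerId = 12345");

        // New approach: index once, then look up by key.
        var byId = new Dictionary<int, DataRow>();
        foreach (DataRow row in table.Rows)
            byId[(int)row["CustomerId"]] = row;

        DataRow fast = byId[12345];
        Console.WriteLine(fast["Name"]);
    }

    static DataTable LoadCustomers()
    {
        var t = new DataTable("Customers");
        t.Columns.Add("CustomerId", typeof(int));
        t.Columns.Add("Name", typeof(string));
        for (int i = 0; i < 100_000; i++)
            t.Rows.Add(i, "Customer " + i);
        return t;
    }
}
```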
If interoperability is or will be a concern down the line, DataSet is definitely not the right direction to go in. You CAN expose DataSets/DataTables over a service, but whether you SHOULD is debatable. If you are talking .NET-to-.NET you're probably OK; otherwise you are going to have a very unhappy client developer on the other side of the fence consuming your service.
You can't convince them otherwise. Pick a smaller challenge or move to a different organization. If your manager respects you see if you can do a project in the domain-driven style as a sort of technology trial.
If you can profile, just do it and profile. DataSets are heavier than a simple Collection<T>.
DataReaders are faster than using adapters...
Changing behavior in an object is much easier than massaging a DataSet.
Anyway: Just Do It, ask for forgiveness not permission.
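If you want something concrete to profile, here's a crude in-memory comparison, not a rigorous benchmark: filling a DataTable versus a plain List<T> with the same rows. The row count and column types are arbitrary; the point is only that the DataSet machinery carries per-row overhead that a simple collection doesn't.

```csharp
// Crude comparison of DataTable fill vs List<T> fill (not a rigorous benchmark).
using System;
using System.Collections.Generic;
using System.Data;
using System.Diagnostics;

record Order(int Id, string Customer, decimal Total);

class Program
{
    static void Main()
    {
        const int n = 500_000;

        var table = new DataTable("Orders");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Customer", typeof(string));
        table.Columns.Add("Total", typeof(decimal));

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++)
            table.Rows.Add(i, "cust" + i, 9.99m);
        sw.Stop();
        Console.WriteLine($"DataTable fill:   {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        var list = new List<Order>(n);
        for (int i = 0; i < n; i++)
            list.Add(new Order(i, "cust" + i, 9.99m));
        sw.Stop();
        Console.WriteLine($"List<Order> fill: {sw.ElapsedMilliseconds} ms");
    }
}
```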
Most programmers don't like to stray out of their comfort zones (note that the intersection of the 'most programmers' set and the 'Stack Overflow' set is probably the empty set). "If it worked before (or even just worked), then keep on doing it." The project I'm currently on required a lot of argument to get the older programmers to use XML/schemas/DataSets instead of just CSV files (the previous version of the software used CSVs). It's not perfect; the schemas aren't robust enough at validating the data. But it's a step in the right direction. The code I develop uses OO abstractions on the DataSets rather than passing DataSet objects around. Generally, it's best to teach by example, one small step at a time.
There is already some very good advice here but you'll still have a job to convince your colleagues if all you have to back you up is a few supportive comments on stackoverflow.
And, if they are as sceptical as they sound, you are going to need more ammo.
First, get a copy of Martin Fowler's "Patterns of Enterprise Application Architecture", which contains a detailed analysis of a variety of data access techniques.
Read it.
Then force them all to read it.
Job done.
Data-centric means less code complexity.
Custom objects mean potentially hundreds of additional objects to organize, maintain, and generally live with. They're also going to be a bit faster.
I think it's really a code-complexity vs performance question, which can be answered by the needs of your app.
Start small. Is there a utility app you can use to illustrate your point?
For instance, at a place where I worked, the main application had a complicated build process, involving changing config files, installing a service, etc.
So I wrote an app to automate the build process. It had a rudimentary WinForms UI. But since we were moving towards WPF, I changed it to a WPF UI, while keeping the WinForms UI as well, thanks to Model-View-Presenter. For those who weren't familiar with Model-View-Presenter, it was an easily-comprehensible example they could refer to.
Similarly, find something small where you can show them what a non-DataSet app would look like without having to make a major development investment.