Software crashes in production environment, no access to debugger. What to do in the short term and long term? [closed]

This is an interview question:
Software crashes in the production environment, with no access to a debugger. What steps would you take to solve the problem in the short term? In the long term? What would you do to prevent it from happening? What tools would you use?
My ideas:
Short term:
Check the program's log files and any OS-generated logs, which may contain signals about the crash.
Narrow down the file where the program crashes by adding print statements.
Add try-catch blocks in the likely locations.
Find the reason.
Long-term:
Review the overall program design and the algorithm/data-structure choices to make sure they are correct and suitable.
Test the program with the cases that have caused crashes to find the root causes.
Tools: GDB, the Valgrind family, gprof
Any better ideas or solutions?

Short Term
1. The absolute first thing to do is work out what was done to trigger the problem and try to reproduce it. If you can do that, you can track it down in an environment where a debugger is available.
2. If it is not reproducible, you need to look through all the information you collected in step one (which will include any logging) and see if you can see a possible problem.
3. If the problem has not been found, you will need to add logging, and lots of it. This is where a "DEBUG" logging setting comes in handy (a minimal sketch follows this list). It will probably slow down the system, and may even mask the problem (which itself tells you something about the nature of the problem).
4. With the new logging information you can go back to step one. Repeat this until the problem is solved!
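As a rough illustration of such a toggleable DEBUG setting (Python here, with invented names; the original doesn't specify a language), operators can raise the verbosity in production without redeploying:

    import logging
    import os

    # Verbosity is controlled by an environment variable so operators can
    # turn DEBUG logging on in production without touching the code.
    level_name = os.environ.get("APP_LOG_LEVEL", "INFO")  # hypothetical variable
    logging.basicConfig(
        level=getattr(logging, level_name, logging.INFO),
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    log = logging.getLogger("myapp")

    log.debug("entering request handler")  # emitted only at DEBUG level
    log.info("request served")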
In the long term, the most obvious thing to do is make sure you have sufficient logging in place, even if it has to be turned on and off, to catch problems. As well as this, you need to try to beef up the testing effort.
When you have tracked down a problem, it is worth noting the type of problem (race condition, scalability, database access, etc.). This gives you an area to apply more automated and manual tests.

You have some good initial ideas; here are my comments:
Add logging to your code - you will get very little information from the operating system about your code.
If exceptions can be thrown by methods that you call, you should catch them. Don't let them bubble up to the end user! (A last-resort handler is sketched after this list.)
Run Valgrind now, not later.
Set up a test environment that simulates your production environment. Start simple, and increase the complexity until you are able to reproduce your issue. You do have a test environment, right?
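A minimal sketch of such a last-resort handler, assuming Python; sys.excepthook is a real hook, but the logger name is illustrative:

    import logging
    import sys

    log = logging.getLogger("myapp")  # hypothetical logger name

    def last_resort(exc_type, exc_value, exc_tb):
        # Log the full traceback for later diagnosis instead of letting the
        # raw exception reach the end user.
        log.critical("Unhandled exception",
                     exc_info=(exc_type, exc_value, exc_tb))

    sys.excepthook = last_resort  # called for any otherwise-uncaught exception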

The very first thing you should do is determine the severity of the problem. This will help to devise your short-term strategy. You will need to have some brief discussions with the major stakeholders in the software (such as the client), or have a project manager do this and report back to you.
In the heat of the moment, this bit is often overlooked, and rushing a short-term fix almost always means wasting a lot of time not really understanding what needs to be done.
After this, your actual strategy, both long term and short term, is rather dependent on the technology you are using and how it is deployed.
Short term
It is absolutely vital to grab some preliminary information about the crash before attempting to resolve the problem: grab log files, take screenshots, note down system information such as memory/CPU usage, and archive any temporary data that might be useful.
The short-term action should be to get the system up-and-running again, quickly. Some common approaches to short-term solutions:
Try turning it off and on again... Seriously, 90% of the time this will get production running again in the short term, at least until the bug manifests itself again.
Revert to a previous production release, preferably the latest version that was known to work fairly reliably.
Run a second instance on another machine and fail over if the problem occurs again. This has the added bonus that logs and system state are preserved after the last crash occurred.
Long term
In the long term, you will want to properly analyse the information you gathered at the time of failure. Where possible, try to reproduce the problem as closely as you can. Revert your code to the version being deployed (you do use version control tools, right?), and check high-level factors as well as low-level configuration ones, e.g. who was using the system when it crashed? Can they show you what they did?
Debugging and logging may be useful at this stage, and all the usual developer tools such as functional tests and memory profiling tools. A crash could come from a number of sources, from memory protection faults to an unexpected state of a resource. You should compile a list of candidate problems, and cross them off as you gain confidence that they aren't the cause of the crash.

Apart from logging, you can enable the creation of minidump (.mdmp) files on Windows or core dumps on Linux, then examine them later. Minidumps and core dumps capture the context of the application at the moment the crash occurred; one downside of this approach is that core dumps can be pretty big.
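For example, on Linux a process can opt into core dumps itself; here is a minimal Python sketch (the shell equivalent is `ulimit -c unlimited`, and a C program would call setrlimit directly):

    import faulthandler
    import resource

    # Lift the core-dump size limit so the kernel writes a full dump on a
    # crash (this may require the hard limit to already allow it).
    resource.setrlimit(resource.RLIMIT_CORE,
                       (resource.RLIM_INFINITY, resource.RLIM_INFINITY))

    # Also print a Python-level traceback on hard crashes (SIGSEGV, SIGABRT, ...).
    faulthandler.enable()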


Should human factor be taken into account when deciding on what process to use? [closed]

When you are deciding on what methodology or process to use for your project, should you take into account the human factors? If there is any resistance to things, do you go with the flow or force people to change?
For example, say you want to push for pair programming but the team members resist working in that mode (or show dislike). What would you do? Make them get used to it, try to convince them to do it, or go with the flow and let them do what they like?
The human factor is the most important one.
If you consider nothing else, consider the culture and proclivities of the group.
People who want a process to fail will succeed. It's far easier to alter process than to alter people.
First, try to reason; you may be wrong:
If you have resistance to certain things, you can usually give your points for why you think it's good, hear their points for why they think it is bad, and come to some common ground.
You should never force people into doing something against their will, but instead try to convince them based on your logical reasoning. Many times you will see reasons from them that change your point of view.
If your developer is too afraid to voice their opinion, then you should make them feel comfortable with giving their opinion. If they are still reluctant, then you should consider new developers.
Foot in the door principle:
If you want to try some new concept that neither you nor they have experience in, say pair programming, then you can ask them to try it for 1-2 weeks and then you can sit together again after this trial period and assess the effectiveness. I think most people will find it perfectly reasonable to try something new if they have no experience in it, if it is for the purpose of finding out the method's effectiveness, and if it is only for a trial period.
If after this trial period, the thing you were testing was successful, then your developer will be more open to the idea.
Don't change them, find someone who fits:
If you are 100% for some way of doing things, and your developer is 100% against it, and he won't try it and has no logical reason why, instead of trying to change him you're better off finding a developer that will fit into your way of doing things.
If they are 100% against what you want to change, you have to make a decision. Is the developer themselves more important to you, or is the process that you want to change more important.
If you force someone into something they don't want to do, they will find a way to make your method fail.
Yes. Your development process needs to be humane. That said, there are better and worse development practices and you should strive to use the better practices. The best methodologies understand both human strengths and weaknesses and have practices that promote the former and compensate for the latter.
For example, most agile processes put a high value on trusting developers to do the right thing -- to work hard and value quality. They allow developers to have significant input into the process and into the product. This takes advantage of the human quality of rising to expectations. On the other hand, humans have trouble managing too much complexity at one time, so agile practices insist on breaking things down into manageable chunks.
On the other hand, we know that people don't like to do things that don't directly add value to their work. Agile practices, recognizing the value of things like unit testing, nevertheless insist on them and require the developer to conform despite the initial reluctance. Using TDD compensates for this somewhat by giving real value to developing tests -- you write them first and let them guide the design. It's a bit of a carrot-and-stick approach to get developers over the initial reluctance, to the point where they can experience the value of the method and buy into it on their own.
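As a toy illustration of that test-first flow (the function and its test are invented for this example): write the failing test, then just enough code to make it pass:

    import unittest

    # Step 1: the test is written first, and fails until the code exists.
    class TestSlugify(unittest.TestCase):
        def test_lowercases_and_joins_words(self):
            self.assertEqual(slugify("Hello World"), "hello-world")

    # Step 2: the minimal implementation that makes the test pass.
    def slugify(title):
        return title.strip().lower().replace(" ", "-")

    if __name__ == "__main__":
        unittest.main()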
Adapting the Process
The key to developing a good process with your people lies in adapting the process to the amount of ceremony that you need or want. We use the RUP where I work and one of the central goals of the RUP is to tailor the amount of ceremony in your process to fit your project and the personnel.
For instance, small projects require far less ceremony and tool support. As well, people new to a process need time to adapt; it's best not to flood them with information, and to let them adapt at their own pace.
Show Me the Money!
The way to get people to buy into a new process is to let them make a mistake (or present an example from the past) and then show them how the process could have helped prevent it. Try to draw a direct line showing how the process will help them improve the way they work.
For instance: if people are resistant to automating builds and running tests automatically, then the next time they release a fix that breaks a piece of code that was already working, use that opportunity to illustrate that an automated test would have caught the error before it got released, saving everyone time and money.
Automation
The way to ensure people can adapt to a process is to remove as much human intervention as you can. Automate builds, tests, and reporting as much as possible, using information that is automatically captured.
This helps support the process by removing the "nag" factor. Many people resist a new process because they figure it means more work for them, or extra work that produces little result in the end. By automating existing tasks and gathering data from them, you get a lot of benefit without increasing any individual developer's workload.
A classic example is continuous integration. Continuous Integration tools like CruiseControl, TeamCity or Hudson can work with version control repositories to extract latest versions of source code, build that code, execute and archive test results and package stuff for deployment. This requires no extra effort on the part of the developer but you get a lot of extra "process" in return. You now know how good your source code is, you can distribute it easily and you can catch bugs earlier.
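A toy Python sketch of the loop such tools automate; the repository path, make targets, and notify helper are all illustrative, and real CI servers add scheduling, build history, and proper notifications on top:

    import subprocess
    import time

    REPO = "checkout/myproject"  # hypothetical working copy

    def ok(*cmd):
        # Run a command in the working copy; True if it exited cleanly.
        return subprocess.run(cmd, cwd=REPO).returncode == 0

    def notify(message):
        print("CI:", message)  # stand-in for e-mail/chat notification

    while True:
        ok("git", "pull")              # 1. extract the latest source
        if not ok("make", "all"):      # 2. build that code
            notify("build broken")
        elif not ok("make", "test"):   # 3. execute and record the tests
            notify("tests failing")
        else:
            ok("make", "package")      # 4. package for deployment
        time.sleep(300)                # poll the repository every 5 minutes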

What do you do when a library you use is no longer maintained? [closed]

One thing I've always found frustrating is when a library I use is no longer maintained. Even looking at update history and community beforehand, I've run into the situation where I check back later to find that the version I'm using is the last version.
Generally this goes unnoticed until a few months have passed, or some bug/limitation has been found. I run into this fairly often when coding in Python, because my desire to upgrade to a new version of the interpreter can easily introduce problems in libraries that worked fine before. My question is: what is the best response to this situation?
Do you become the maintainer of the old library? Even if you're only fixing the bugs you care about, this is still a lot of work. Especially if the library is large, complex, and has less-than-well-documented code (the case more often than not).
Do you switch to a different library (if there is one)? This is also a significant undertaking, with the potential to introduce new bugs, especially if the only alternatives approach the problem from a different angle. This can be true even if you had the foresight to write an abstraction layer for the old library's functionality.
Do you roll your own? It probably ends up as less code than the old library, since you only write the parts you care about. It's therefore easier to maintain in the future. But now you've wasted days/weeks/months to produce something that is probably less functional, and is guaranteed to introduce tons of new bugs.
I realize the answer depends on the specific case: the size of the library, whether source is available, how maintainable it is, how much of it your code uses, how deeply your code relies on it, etc. I'm looking for answers across a range of cases. What are your experiences with this problem?
Well, you've found one argument to lessen the number of external dependencies...
I've come across this in several Java projects I've audited; people seem to have a tendency to drop in a Jar found somewhere on the Web for the tiniest amount of reuse it offers. The result is a mess of dependencies that ends up undermining the code base. I prefer to use external components sparingly.
It's probably most useful to ask what you can do before. Make a point of evaluating the future lifetime of an external component before you start using it. Do some research on how large its developer community and its user community are. Also, prefer to use a component that has one or two "lesser" alternatives which you could also use.
If there's something you're tempted to use, but it has only one or two people working on it and isn't used much beyond their own project, then you should probably roll your own - or join forces with the maintainers of the component.
I think the real answer lies in how you select third-party libraries to include in your code.
If you happen to like constantly upgrading your code to the latest version of the language, then by default you can only use libraries that have active communities behind them.
In fact, I would go as far as saying that the only time you want to use a third-party open source library is when the community behind it is large (say at least 40+ users) and it has undergone a few releases.
For a commercial library the same thing applies: how long is the company going to be around, and how many other clients use it?
If you can't find a library in this position, then ensure that you abstract the third-party library out of your code so replacement isn't hard in the future.
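A minimal sketch of such an abstraction layer in Python, using the standard json module as a stand-in for the third-party dependency (module and function names are invented):

    # codec.py -- the only module allowed to import the external dependency.
    # Everyone else imports decode()/encode() from here, so if the library
    # dies, its replacement touches this one file.
    import json  # stand-in for the third-party library

    def decode(text):
        return json.loads(text)

    def encode(data):
        return json.dumps(data, sort_keys=True)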
When the Java EE framework my employer chose went belly up, we went out and found a newer, better one. Fortunately Spring was available.
We prefer to roll our own for that very reason. We end up with full control over it, full knowledge of how it works, and we can change it any way we want. When the blame game is played, our ass is on the line, so we prefer to reduce the risk and do it ourselves.
We had a situation once where we did use an external library, and it got rewritten and repurposed by the author and no longer did what we expected. We got past that, wrote our own version, and continued safely.
The bottom line is safety, and minimization of risk.
If the source is available, the licence is open and the library does the job really well, you have the option to fork the library. By doing this, you can also add new features to it. If the library has lots of things to fix and the code is a mess, it is better to find something else to work with.

Scrum, but with no testing or documentation [closed]

What do you do when you join a team that says they use Scrum, but only use it as a time-management tool and not the whole process?
How can I reinstate testing and documentation?
I was thinking to start off with adding user stories specifically for testing and documenting.
Perhaps someone else has more experience with this than I do, as I am sure it's not that uncommon.
The key to Scrum is that a task must be identifiable as "done" before it can be classed as done. How does your company assess whether something is done without reviewing documentation and tests?
Perhaps they have an unusual, but valid, way of doing it. Or perhaps they have missed the point of "done tasks". I'd suggest you start by asking them how they measure "done" and whether it could be improved. Then suggest documentation and testing as the way of improving the process.
Note that neither testing nor documentation are in fact part of Scrum. Scrum is a pure project management approach - the required engineering practices, like the ones you mention, are supposed to "emerge" during the project. And most specifically, they are supposed to be identified during the heartbeat retrospectives that you do at the end of every sprint. Are you doing those? Can you bring up your concerns there - and are they actually the biggest concerns the team has?
Is the issue that they don't have any documentation and tests, or that they aren't implementing the entire Scrum methodology? Those are two very different problems in my mind.
I would much prefer an organization that has taken the time and effort to find and fit a development process that matches their development style as opposed to mandating down from on high the one true process. So I would not be concerned at all if they were using a process that they called Scrum but that didn't meet all the "official" guidelines. Try to determine why the process is the way it is. Chances are that if they have taken the time to tailor it, the team will be receptive to your ideas, especially if you have taken the time to determine why things are the way they are. If you simply approach it as "this isn't Scrum and so isn't right", you will probably not make much headway, but by being pragmatic about the benefits you can likely make some substantial improvements.
Alternatively, if they aren't doing testing and don't have any documentation, I would consider that a fairly bad sign. And by documentation I am taking the minimalist view here - a list of features, bug tracking, etc. I would be very concerned by the absence of these items, and less concerned by the absence of items higher up the abstraction list. In the absence of support from management, I would suggest you lead by example. Take it on yourself to set up a simple bug tracking system (there are several - in a pinch, simple text lists in a central location work as well). Don't declare your features complete until someone else has tested them. This can be as simple as walking over to another developer and asking them to try it in front of you. If someone claims a feature is complete, take a few minutes to familiarize yourself with it. If you find a bug, politely mention it to the responsible developer. Slowly build an environment where the team can see the benefits of running tests and tracking features and bugs.
Most teams operate in this manner simply because of a mistaken belief that they don't have time to "do it right", or that they will get to it later. Often this will occur when a simple proof-of-concept done by a developer or two as a side-project turns into a full-on development effort. By showing that it can actually save time and effort, and reducing the initial costs to the rest of the team, you will often find that it becomes ingrained as part of the process without ever actually being officially endorsed or accepted.
If you have management support it will make it much easier, but always be careful to make sure that the team is receptive to the changes. This may mean it takes longer than you want, but so be it, without the team's support any mandated process will fail at the first sign of pressure, which is when you need the process the most.
*Disclaimer - On my last project I spearheaded the movement to tailor the Scrum process to fit our environment. The "official" process was simply untenable for our client, but it was still an invaluable guide in tailoring our process.
"adding user stories specifically for testing and documenting"
While meta-user stories might make sense in some circles, they rarely work out well. Software folks rarely cope well with meta-user stories: they either don't get the idea that they can change their own processes by writing a story, or -- more typically -- they engineer the meta-user story to death.
When you're interviewing users, it feels like they're making the user story up. Certainly, you're making it up as you listen to them and try to capture it.
When an IT organization tries to make up its own user stories about how IT should work, the process falls apart. Until the organization has done the thing (testing, for example) a bunch of times manually, they're not really qualified to write user stories. Then, after they've done it, they don't need software development processes, they'll just automate the important bits a little at a time.
I think change has to come from a less formal direction. Actually balking at calling something "done" that hasn't been tested is a good starting point.
IT doesn't do things unless forced. So, meet the users and find out why they're not requiring testing. Coach them to require testing. Tell them the consequences and the words to use.
A lot can go wrong in an organization to lead to poor processes. It's important to know what's wrong, and create a demand for change. The best possible thing is to have your boss complaining that you're not fixing it, rather than you suggesting that perhaps it would be good to fix it.
[It doesn't feel right when your boss demands you fix the process, but it's about the only way change will happen.]

How do I encourage code sharing and limit the bug tracking overhead while maintaining flexibility in my releases? [closed]

How do you track changes and testing effort for bugs that impact multiple artifacts released separately?
Code sharing is good because it reduces the total number of paths through the code, which means more impact for fewer changes and fewer bugs (or more bugs addressed with fewer changes). For example, we may build a search tool and an indexer that use the same file handling package or model package.
We need to be able to ensure that changes get tested in all the right components and track which changes were included with which released tools. We also don't want to be forced to release the change in all applications at the same time.
Goal: one bug to be tested, scheduled, and tracked independently against each released application, with automated systems that understand the architecture guiding us to make the right choices.
Bug Split Release Scenario:
We may release a patch of the search tool that contains a performance fix in a util library. Critical for the search tool, the fix is less visible in the indexer, so it can wait until the next maintenance release. We want the one bug to be scheduled, tracked, and released with the search patch and deferred until the indexer's next maintenance release.
So, when I create a bug in our tracking system (JIRA) I want it to magically become multiple objects.
a primary issue describing the problem and tracking the development work
a set of tasks that let me track testing effort, and how this issue has been released, for each application it impacts.
How can we make the user experience of code sharing low effort to encourage more of it without becoming blind to what changes impacted which releases or forcing people to enter many duplicate bugs?
I'm sure that large-scale projects from Eclipse to Linux distros have faced this kind of problem, and I wonder how they have solved it (I'm going to poke around on them next).
Do any of you have experience with this kind of situation and how have you approached it?
In Jira you can enable sub-tasks, so you could assign sub-tasks to the main task. You can also allow time tracking on issues, so you know how much time each task is taking and what the difference between estimated and actual is.
You can also enable versioning so you have a road map of what is being done in the next release with a change log. The problem with the road map is that it is only for one project so you can't have a road map that covers all of your projects.
Finally, you can create your own custom workflows to do almost anything you want to do. I've never tried this because we'd have to learn a new language to do it and the reason we got Jira was to decrease development overhead, not increase it by having to customise our bug tracker - but it is possible.
For Jira, make use of the "affects versions" and "fixed in versions" fields (plus you can add multiple custom fields, like "verified by QA in versions").
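As a rough sketch of making one bug "become multiple objects" automatically, here is what that could look like against Jira's REST API (v2) using Python and the third-party requests library; the instance URL, credentials, and field choices are assumptions that vary per installation:

    import requests  # third-party HTTP library

    JIRA = "https://jira.example.com"  # hypothetical instance URL
    AUTH = ("bot-user", "api-token")   # hypothetical credentials

    def add_release_subtask(parent_key, app_name):
        # One tracking sub-task per application the fix must be released in.
        payload = {"fields": {
            "project":   {"key": parent_key.split("-")[0]},
            "parent":    {"key": parent_key},
            "summary":   "Verify and release the fix in " + app_name,
            "issuetype": {"name": "Sub-task"},
        }}
        response = requests.post(JIRA + "/rest/api/2/issue",
                                 json=payload, auth=AUTH)
        response.raise_for_status()
        return response.json()["key"]

    for app in ("search-tool", "indexer"):
        print(add_release_subtask("UTIL-123", app))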

What do you do with a developer who does not test his code? [closed]

One of our developers is continually writing code and putting it into version control without testing it. The quality of our code is suffering as a result.
Besides getting rid of the developer, how can I solve this problem?
EDIT
I have talked to him about it a number of times and even given him a written warning.
If you can do code reviews -- that's a perfect place to catch it.
We require reviews prior to merging to iteration trunk, so typically everything is caught then.
If you systematically perform code reviews before allowing a developer to commit the code, well, your problem is mostly solved. But this doesn't seem to be your case, so this is what I recommend:
Talk to the developer. Discuss the consequences for others in the team. Most developers want to be recognized by their peers, so this might be enough. Also point out that it is much easier to fix bugs in code that's fresh in your mind than in weeks-old code. This part makes sense if you have some form of code ownership in place.
If this doesn't work after some time, try to put in place a policy that will make committing buggy code unpleasant for the author. One popular way is to make the person who broke the build responsible for the chores of creating the next one. If your build process is fully automated, look for another menial task for them to take care of instead. This approach has the added benefit of not pinpointing anyone in particular, making it more acceptable to everybody.
Use disciplinary measures. Depending on the size of your team and of your company, those can take many forms.
Fire the developer. There is a cost associated with keeping bad apples. When you get this far, the developer doesn't care about his fellow developers, and you've got a people problem on your hands already. If the work environment becomes poisoned, you might lose far more - productivity-wise and people-wise - than this single bad developer.
As a developer who rarely tests his own code, I can tell you the one thing that's made me slowly shift my behavior...
Visibility
If the environment allows pushing code out, waiting for users to find problems, and then essentially asking "How about now?" after making a change to the code, there's no real incentive to test your own stuff.
Code reviews and collaboration encourage you to work towards making a quality product much more than if you were just delivering 'Widget X' while your coworkers work on 'Widget Y' and 'Widget Z'
The more visible your work is, the more likely you are to care about how well it works.
Code review. Stick all of your devs in a room every Monday morning and ask them to bring their proudest code-based accomplishment from the previous week along to the meeting.
Let them take the spotlight and get excited about explaining what they did. Have them bring copies of the code so other devs can see what they're talking about.
We started this process a few months ago, and it's astonishing to see the amount of subconscious quality checking that takes place. After all, if the devs are simply asked to talk about what they're most excited about, they'll be totally stoked to show people their code. Then, other devs will see the quality errors and publicly discuss why they're wrong and how the code should really be written instead.
If this doesn't get your dev to write quality code, he's probably not a good fit for your team.
Make it part of his Annual Review objectives. If he doesn't achieve it, no pay rise.
Sometimes, though, you do just have to accept that someone is not right for your team/environment. It should be a last resort and can be tough to handle, but if you have exhausted all other options it may be the best thing in the long run.
Tell the developer you would like to see a change in their practices within 2 weeks or you will begin your company's disciplinary procedure. Offer as much help and assistance as you can, but if you can't change this person, he's not right for your company.
Using Cruise Control or a similar tool, you can make checkins automatically trigger a build and unit tests. You would still need to ensure that there are unit tests for any new functionality he adds, which you can do by looking at his checkins.
However, this is a human problem, so a technical solution can only go so far.
Why not just talk to him? He probably won't actually bite you.
Make him "babysit" the build, and become the build manager. This will give him less time to develop code (thus increasing everyone's performance) and teach him why a good build is so necessary.
Enforce test cases - code cannot be submitted without unit test cases. Modify the build system so that if the test cases don't compile and run correctly, or don't exist, then the entire task check-in is denied (a minimal sketch of such a gate follows below).
-Adam
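A minimal sketch of such a check-in gate, assuming a Git pre-commit hook and a pytest suite (neither tool is named above; adapt to your VCS and test runner):

    #!/usr/bin/env python
    # .git/hooks/pre-commit -- reject the commit if the test suite fails.
    import subprocess
    import sys

    result = subprocess.run([sys.executable, "-m", "pytest", "-q"])
    if result.returncode != 0:
        sys.exit("Commit rejected: the test suite failed. Fix the tests first.")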
Publish stats on test code coverage per developer; this would be after talking to him.
Here are some ideas from a sea shanty.
Intro
What shall we do with a drunken sailor, (3×)
Early in the morning?
Chorus
Wey–hey and up she rises, (3×)
Early in the morning!
Verses
Stick him in a bag and beat him senseless, (3×)
Early in the morning!
Put him in the longboat till he’s sober, (3×)
Early in the morning!
etc. Replace "drunken sailor" with "sloppy developer".
Depending on the type of version control system you are using, you could set up check-in policies that force the code to pass certain requirements before being allowed in. If you are using a system like Team Foundation Server, it gives you the ability to specify code-coverage and unit testing requirements for check-ins.
You know, this is a perfect opportunity to avoid singling him out (though I agree you need to talk with him) and implement a test-first process in-house. When the rules aren't clear and the expectations aren't known to all, I've found that what you describe isn't all that uncommon. I find that the test-first development scheme works well for me and improves the code quality.
They may be overly focused on speed rather than quality.
This can tempt some people into rushing through issues to clear their list and see what comes back in bug reports later.
To rectify this balance:
assign only a couple of items at a time in your issue tracking system,
code review and test anything they have "completed" as soon as possible so it will be back with them immediately if there are any problems
talk to them about your expectations about how long an item will take to do properly
Pair programming is another possibility. If he works with another skilled developer on the team who does meet quality standards and knows the procedure, then this has a few benefits:
With an experienced developer over his shoulder he will learn what is expected of him and see the difference between his code and code that meets expectations
The other developer can enforce a test-first policy: not allowing code to be written until tests have been written for it
Similarly, the other developer can verify that the code is up to standard before it is checked in, reducing the number of bad check-ins
All of this of course requires the company and developers to be receptive to this process which they may not be.
It seems that people have come up with a lot of imaginative and devious answers to this problem. But the fact is that this isn't a game. Devising elaborate peer-pressure systems to "name and shame" him is not going to get to the root of the problem, i.e. why is he not writing tests?
I think you should be direct. I know you say that you've talked to him, but have you tried to find out why he isn't writing tests? Clearly at this point he knows that he should be, so surely there must be some reason why he isn't doing what he's been told to do. Is it laziness? Procrastination? Programmers are famous for their egos and strong opinions - perhaps he's convinced for some reason that testing is a waste of time, or that his code is always perfect and doesn't need testing. If he's an immature programmer, he might not fully understand the implications of his actions. If he's "too mature" he might be too set in his ways. Whatever the reason, address it.
If it does come down to a matter of opinion, you need to make him understand that he needs to set his own personal opinion aside and just follow the rules. Make it clear that if he can't be trusted to follow the rules then he will be replaced. If he still doesn't, do just that.
One last thing - document all of your discussions along with any problems that occur as a result of his changes. If it comes to the worst you may be forced to justify your decisions, in which case, having documentary evidence will surely be invaluable.
Stick him on his own development branch, and only bring his stuff into the trunk when you know it's thoroughly tested (a sketch of this workflow follows below). This might be a place where a distributed source control management tool like Git or Mercurial would excel, although with the increased branching/merging support in SVN, you might not have too much trouble managing it.
EDIT
This is only if you can't get rid of him or get him to change his ways. If you simply can't get this behaviour to stop (by changing or firing), then the best you can do is buffer the rest of the team from the bad effects of his coding.
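A command-line sketch of that workflow (the branch names are invented; the point is the isolation, not the names):

    # Give the developer his own branch; he commits there freely.
    git checkout -b unstable/dev-x

    # Later, bring his work into the trunk only after it has been
    # reviewed and has passed a full test run.
    git checkout trunk
    git merge --no-ff unstable/dev-x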
If you are at a place where you can affect the policies, make some changes. Do code reviews before check ins and make testing part of the development cycle.
It seems pretty simple. Make it a requirement and if he can't do it, replace him. Why would you keep him?
I usually don't advocate this unless all else fails...
Sometimes, a publicly-displayed chart of bug-count-by-developer can apply enough peer pressure to get favorable results.
Try the carrot: make it a fun game.
E.g The Continuous Integration Game plugin for Hudson
http://wiki.hudson-ci.org/display/HUDSON/The+Continuous+Integration+Game+plugin
Put your developers on branches of your code, based on some logic like, per feature, per bug fix, per dev team, whatever. Then bad check-ins are isolated to those branches. When it comes time to do a build, merge to a testing branch, find problems, resolve, and then merge your release back to a main branch.
Or remove commit rights for that developer and have them send their code to a younger developer for review and testing before it can be committed. That might motivate a change in procedure.
You could put together a report of the errors found in the code, with the name of the programmer responsible for each piece of software.
If he's a reasonable person, discuss the report with him.
If he cares for his "reputation" publish the report regularly and make it available to all his peers.
If he only listens to the "authority", do the report and escalate the issue to his manager.
Anyway, I've often seen that when people are made aware of how bad they look from the outside, they change their behaviour.
Hey this reminds me of something I read on xkcd :)
Are you referring to writing automated unit tests, or to manual unit testing prior to check-in?
If your shop does not write automated tests then his checking in of code that does not work is reckless. Is it impacting the team? Do you have a formalized QA department?
If you are all creating automated unit tests, then I would suggest that your code review process include the unit tests as well. It will become obvious during your review if the code is not acceptable per your standards.
Your question is rather broad but I hope I provided some direction.
I would agree with Phil that the first step is to individually talk to him and explain the importance of quality. Poor quality can often be linked to the culture of the team, department and company.
Make executed test cases one of the deliverables before something is considered "done."
If you don't have executed test cases, then the work is not complete, and if the deadline passes before you have the documented test case execution, then he has not delivered on time, and the consequences would be the same as if he had not completed the development.
If your company's culture would not allow for this, and it values speed over accuracy, then that's probably the root of the problem, and the developer is simply responding to the incentives that are in place -- he is being rewarded for doing a lot of things half-assed rather than fewer things correctly.
Make the person clean latrines. It worked in the Army. And if you work in a group with individuals who eat a lot of Indian food, it won't take long for them to fall in line.
But that's just me...
Every time a developer checks something in that does not compile, put some money in a jar. You'll think twice before checking in then.
Unfortunately if you have already spoken to him many times and given him written warnings I would say it is about time to eliminate him from the team.
You might find some helpful answers here: How to make junior programmers write tests?
I'd be tempted to suggest elaborating a bit on what you've tried and what results you got, as this may change things a bit, but here are my initial suggestions:
Is it a complete lack of tests, or a lack of comprehensive tests? Some may code blindly and do zero tests, but this is rather rare, IME. Usually some tests are done, but not enough to cover most of the cases that comprehensive testing would.
Group dynamics may help. I assume he is part of a team, and the team's view may be of some help here. In a way this is trying to apply peer pressure, which is usually a bad thing but can sometimes be used in good ways.
How well spelled out were the warnings? In a way this can seem childish, but there is a chance that what you think of as testing may not be the same as what he thinks. Do you want nUnit tests, an Excel spreadsheet, logs from his computer, or something else as proof of the existence and use of tests? From what you've described, there isn't anything to confirm that he understood what you meant, was going to use tests, and would provide evidence of doing so.
Check-in policy question. Some places, such as my current workplace, encourage committing often, which can mean that one does commit code without tests. Is there a known, accepted, and well-followed policy where you are? That's another aspect here.