How to functionally test an extremely complex system? - testing

I've got a legacy system that processes extremely complex data that's changing every second. The modularity of the system is quite poor so I can't split the business logic into smaller modules to ease functional testing.
The current testing approach is "close your eyes, click, and pray", which is not acceptable at all. I want to be confident in the changes we commit to the code.
What are the good testing practices, the bibles to read, and the changes to make to increase confidence in such a system?
The question is not about unit testing: the system wasn't designed for that, and it would take too much time to decouple, mock, and stub all the dependencies; most of all, we sadly don't have the time and budget for that. I don't want a philosophical debate about functional testing: I want facts that work in real life.

It sounds like you have yourself a black box as regards testing.
http://en.wikipedia.org/wiki/Black-box_testing
To put it simply, it's horrible, but may be all you can do if you can't isolate the system in any way.
You need to insert known data into your system and compare the result with the known output.
You really need known data and outputs for:
normal values - normal data - you'll find out that it can at least seem to do the right thing
erroneous values - spelling errors, invalid values - so you know that it will tell you if the input is rubbish
out of range - -1 where a negative value isn't expected, values greater than about 2.1 billion (the signed 32-bit maximum), and so on - so you know it won't crash out on seriously mis-entered or corrupted data
dangerous - input that would break the SQL, simulated SQL injection
Lastly, make sure that all errors are handled carefully rather than merely being logged while the bad/corrupt/null value gets passed on through the system.
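To make that concrete, here is a minimal sketch of the known-input/known-output comparison (a "golden master" style check). It assumes a hypothetical ./legacy_system command line, test data paths and JSON output; substitute whatever your system actually consumes and produces:

# Feed a known input through the system and compare the output against a
# previously approved result. CLI name, paths and JSON format are placeholders.
import json
import subprocess

def run_system(input_path: str) -> str:
    """Invoke the legacy system on a known input and capture its output."""
    result = subprocess.run(
        ["./legacy_system", "--input", input_path],  # hypothetical entry point
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def test_normal_values():
    actual = json.loads(run_system("testdata/normal_case.json"))
    expected = json.loads(open("testdata/normal_case.expected.json").read())
    assert actual == expected, "output drifted from the approved result"

Write one of these per data category (normal, erroneous, out of range, dangerous) and the suite grows naturally as you touch each area.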
Any processes you can isolate and test that way will make debugging easier, as black box testing can't tell you where the error occurred. This means then you need to diagnose the errors based on what happened, more in the style of House MD than a normal debugging session.
Once you have the different data types listed above, you can test all changes in isolation with them, and then in the system as a whole. Over time, as you eventually touch most aspects of the system, you'll have test cases for all areas and be able to say more easily where a failure was most likely to have occurred.
Also: make sure you put tracers in your known data so you don't accidentally indicate a stock market crash when you're testing the range limits on a module, and so you can take the test data out of the result flow before it ends up on a CEO's desk.
I hope that's some help

Working Effectively with Legacy Code by Michael Feathers (http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052) seems to be the book for these situations.

Related

Is it possible for a program to have a fault that dynamic testing cannot find?

Is it possible for a program to have a fault that dynamic testing cannot find? Any simple example?
Please help! Thanks.
Yes. Testing can only demonstrate the absence of bugs for the cases you actually tested. Dynamic testing cannot cover all possible inputs and outputs in all environments with all dependencies.
The first way is to simply not test the code in question. This can be detected by checking the coverage of your tests, but even if you achieve 100% coverage there can still be flaws.
Next is not checking all possible types and ranges of inputs. For example, if you have a function that scans for a word in a string, you need to check for the following (a sketch of such a test appears after the lists below):
The word at the start of the string.
The word at the end of the string.
The word in the middle of the string.
A string without the word.
The empty string.
These are known as boundary conditions and include things like:
0
Negative numbers
Empty strings
Null
Extremely large values
Decimals
Unicode
Empty files
Extremely large files
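As promised above, here is a sketch of how those word-scanning boundary cases might look as a parametrized test. find_word is only a stand-in for whatever the real function under test is:

# Boundary-condition tests for a hypothetical find_word(haystack, word) that
# returns the index of `word` in `haystack`, or -1 if absent.
import pytest

def find_word(haystack: str, word: str) -> int:
    """Stand-in implementation; replace with the real function under test."""
    return haystack.find(word) if word else -1

@pytest.mark.parametrize("haystack, word, expected", [
    ("cat sat on the mat", "cat", 0),    # word at the start
    ("the dog ate the cat", "cat", 16),  # word at the end
    ("a cat sat", "cat", 2),             # word in the middle
    ("no felines here", "cat", -1),      # string without the word
    ("", "cat", -1),                     # the empty string
])
def test_find_word_boundaries(haystack, word, expected):
    assert find_word(haystack, word) == expected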
If the code in question keeps state, maybe in an object, maybe in global variables, you have to test that state does not become corrupted or interfere with subsequent runs.
If you're doing parallel processing you must test any number of possibilities for deadlocks or corruption resulting from trying to do the same thing at the same time. For example, two processes trying to write to the same file. Or two processes both waiting for a lock on the same resource. Do they lock only what they need? Do they give up their locks ASAP?
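A rough sketch of that kind of concurrency hammering (the Counter class is an invented stand-in for whatever stateful code is really under test):

# Hammer shared state from several threads and check for lost updates.
import threading

class Counter:
    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:  # without this lock, concurrent increments can be lost
            self.value += 1

def test_parallel_increments():
    counter = Counter()

    def worker():
        for _ in range(10_000):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert counter.value == 8 * 10_000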
Once you test all the ways the code is supposed to work, you have to test all the ways it can fail: whether it fails gracefully with an exception (instead of garbage), whether an error leaves it in a corrupted state, and so on. How does it handle resource failure, like failing to connect to a database? This becomes particularly important when working with databases and files, to ensure a failure doesn't leave things partially altered.
For example, if you're transferring money from one account to another you might write:
my $from_balance = get_balance($from);
my $to_balance = get_balance($to);
set_balance($from, $from_balance - $amount);
set_balance($to, $to_balance + $amount);
What happens if the program crashes after the first set_balance? What happens if another process changes either balance between get_balance and set_balance? These sorts of concurrency issues must be thought of and tested.
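One common remedy, sketched here with sqlite3 standing in for the real database (the table and column names are invented), is to do the whole transfer inside a single transaction and let the database do the arithmetic, so a crash rolls everything back and a concurrent balance change can't be silently overwritten:

# Perform the transfer inside one transaction; a crash between the two
# updates rolls the whole thing back.
import sqlite3

def transfer(conn: sqlite3.Connection, from_id: int, to_id: int, amount: int) -> None:
    with conn:  # opens a transaction; commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?",
            (amount, from_id, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient funds or unknown account")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, to_id),
        )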
There's all the different environments the code could run in. Different operating systems. Different compilers. Different dependencies. Different databases. And all with different versions. All these have to be tested.
The test can simply be wrong. It can be a mistake in the test. It can be a mistake in the spec. Generally one tests the same code in different ways to avoid this problem.
The test can be right, the spec can be right, but the feature is wrong. It could be a bad design. It could be a bad idea. You can argue this isn't a "bug", but if the users don't like it, it needs to be fixed.
If your testing makes use of a lot of mocking, your mocks may not reflect how the thing being mocked actually behaves.
And so on.
For all these flaws, dynamic testing remains the best we've got for testing more than a few dozen lines of code.

How to quickly analyse the impact of a program change?

Lately I have needed to do an impact analysis for changing a DB column definition in a widely used table (like PRODUCT, USER, etc.). I find it a very time-consuming, boring, and difficult task. Is there any known methodology for doing this?
The question also applies to changes to the application, file system, search engine, etc. At first I thought this kind of functional relationship should be documented or somehow tracked, but then I realized that since everything can change, it would be impossible to do so.
I don't even know what tags this question should have; please help.
Sorry for my poor English.
Sure. One can, technically at least, know what code touches the DB column (reads or writes it) by computing program slices.
Methodology: Find all SQL code elements in your sources. Determine which ones touch the column in question. (Careful: SELECT * may touch your column, so you need to know the schema.) Determine which variables read or write that column. Follow those variables wherever they go, and determine the code and variables they affect; follow all those variables too. (This amounts to computing a forward slice.) Likewise, find the sources of the variables used to fill the column; follow them back to their code and sources, and follow those variables too. (This amounts to computing a backward slice.)
All the elements of the slice are potentially affecting/affected by a change. There may be conditions in the slice-selected code that are clearly outside the conditions expected by your new use case, and you can eliminate that code from consideration. Everything else in the slices you may have to inspect/modify to make your change.
Now, your change may affect some other code (e.g., a new place to use the DB column, or combine the value from the DB column with some other value). You'll want to inspect up and downstream slices on the code you change too.
You can apply this process for any change you might make to the code base, not just DB columns.
Manually this is not easy to do in a big code base, and it certainly isn't quick. There is some automation available for C and C++ code, but not much for other languages.
You can get a rough approximation by running test cases that involve your desired variable or action and inspecting the test coverage. (Your approximation gets better if you also run test cases you are sure do NOT cover the desired variable or action, and eliminate all the code they cover.)
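As a crude, purely textual first pass at the slicing recipe above, something like the following at least enumerates the source lines that mention the column, as candidates for manual inspection; the column name, source directory and extensions are examples, and this is nowhere near a real slicer:

# Find every source line that mentions the column being changed.
import re
from pathlib import Path

COLUMN = "product_code"                      # example column name
SRC_EXTENSIONS = {".java", ".sql", ".py", ".cs"}
pattern = re.compile(rf"\b{re.escape(COLUMN)}\b", re.IGNORECASE)

for path in Path("src").rglob("*"):
    if path.suffix in SRC_EXTENSIONS:
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if pattern.search(line):
                print(f"{path}:{lineno}: {line.strip()}")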
Ultimately this task cannot be fully automated or reduced to an algorithm, otherwise there would be a tool to preview refactored changes. The better the code was written in the beginning, the easier the task.
Let me explain how to reach the answer: isolation is the key. Mapping everything to object properties can help you automate your review.
I can give you an example. If you can manage to map your specific case to the below, it will save your life.
The OR/M change pattern
Like Hibernate or Entity Framework...
A change to a database column can be previewed simply by analysing what code uses a certain object's property. Since all DB columns are mapped to object properties, and assuming no code uses raw SQL, you are good to go for your estimations.
This is a very simple pattern for change management.
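A minimal illustration of the pattern (SQLAlchemy is used here purely as an example mapper, and the Product/product_code names are invented): every use of the column goes through one mapped attribute, so your IDE's find-usages answers the impact question.

# The DB column is reached only through Product.code, so renaming or retyping
# the column becomes a find-usages / rename-refactor on a single attribute.
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Product(Base):
    __tablename__ = "PRODUCT"

    id = Column(Integer, primary_key=True)
    code = Column("product_code", String(20))  # column name and length live in one place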
In order to reduce a file system/network or data file issue to the above pattern, you need other software patterns in place. I mean: if you can reduce a complex scenario to a change in your objects' properties, you can leverage your IDE to detect the changes for you, including code that needs a slight modification to compile or needs to be rewritten entirely.
If you want to manage a change in a remote service when you initially write your software, wrap that service in an interface, so you will only have to modify its implementation.
If you want to manage a possible change in a data file format (e.g. a field length change in a positional format, or column reordering), write a service that maps that file to an object (e.g. using the BeanIO parser).
If you want to manage a possible change in file system paths, design your application to rely more on runtime variables.
If you want to manage a possible change in cryptography algorithms, wrap them in services (e.g. HashService, CryptoService, SignService).
If you do the above, your manual requirements review will be easier, because although the overall task is manual, it can be aided by automated tools. You can, for example, try changing the name of a class's property and see its side effects in the compiler.
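A tiny sketch of the "wrap it in a service" idea from the list above (HashService is taken from the list; everything else is invented): callers depend on the abstract interface, so a later algorithm change is confined to one implementation class.

# Callers depend on HashService, not on a specific algorithm.
from abc import ABC, abstractmethod
import hashlib

class HashService(ABC):
    @abstractmethod
    def digest(self, data: bytes) -> str: ...

class Sha256HashService(HashService):
    def digest(self, data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

# Moving to another algorithm later means adding one implementation and
# changing only the place where it is constructed.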
Worst case
Obviously, if you need to change the name, type, and length of a specific column in a database, in software with plain SQL hardcoded and scattered in multiple places around the code, where (worse) many tables use similar column names, and without project documentation (I did say worst case, right?) across a total of 10,000+ classes, you have no other way than manually exploring your project, using find tools but not relying on them.
And if you don't have a test plan, which is the document from which you can hope to originate a software test suite, it will be time to make one.
Just adding my 2 cents. I'm assuming you're working in a production environment so there's got to be some form of unit tests, integration tests and system tests already written.
If yes, then a good way to validate your changes is to run all these tests again and create any new tests which might be necessary.
And to state the obvious, do not integrate your code changes into the main production code base without running these tests.
Then again, changes which worked fine in a test environment may not work in a production environment.
Have some form of source control / configuration management system such as Subversion, Git/GitHub, CVS, etc.
This enables you to roll back your changes.

What techniques are available for memory optimizing in 8051 assembly language?

I need to optimize code to get room for some new code. I do not have the space for all the changes. I can not use code bank switching (80c31 with 64k).
You haven't really given a lot to go on here, but there are two main levels of optimizations you can consider:
Micro-Optimizations:
eg. XOR A instead of MOV A,0
Adam has covered some of these nicely earlier.
Macro-Optimizations:
Look at the structure of your program, the data structures and algorithms used, the tasks performed, and think VERY hard about how these could be rearranged or even removed. Are there whole chunks of code that actually aren't used? Is your code full of debug output statements that the user never sees? Are there functions specific to a single customer that you could leave out of a general release?
To get a good handle on that, you'll need to work out WHERE your memory is being used up. The Linker map is a good place to start with this. Macro-optimizations are where the BIG wins can be made.
As an aside, you could - seriously- try rewriting parts of your code with a good optimizing C compiler. You may be amazed at how tight the code can be. A true assembler hotshot may be able to improve on it, but it can easily be better than most coders. I used the IAR one about 20 years ago, and it blew my socks off.
With assembly language, you'll have to optimize by hand. Here are a few techniques:
Note: IANA8051P (I am not an 8051 programmer, but I have done lots of assembly on other 8-bit chips).
Go through the code looking for any duplicated bits, no matter how small, and make them functions.
Learn some of the more unusual instructions and see if you can use them to optimize. E.g., a nice trick is to use XOR A to clear the accumulator instead of MOV A,0 - it saves a byte.
Another neat trick: if you call a function right before returning, just jump to it instead. E.g., instead of:
CALL otherfunc
RET
Just do:
JMP otherfunc
Always make sure you are using relative jumps and branches wherever possible; they use less memory than absolute jumps.
That's all I can think of off the top of my head for the moment.
Sorry I am coming to this late, but I once had exactly the same problem, and it became a repeated problem that kept coming back to me. In my case the project was a telephone, on an 8051 family processor, and I had totally maxed out the ROM (code) memory. It kept coming back to me because management kept requesting new features, so each new feature became a two step process. 1) Optimize old stuff to make room 2) Implement the new feature, using up the room I just made.
There are two approaches to optimization: tactical and strategic. Tactical optimizations save a few bytes at a time with a micro-optimization idea. I think you need strategic optimizations, which involve a more radical rethinking of how you are doing things.
Something I remember worked for me and could work for you:
Look at the essence of what your code has to do and try to distill out some really strong, flexible primitive operations. Then rebuild your top-level code so that it does nothing low-level at all except call on the primitives. Ideally use a table-based approach: your table contains things like input state, event, output state, primitives.... In other words, when an event happens, look up the cell in the table for that event in the current state. That cell tells you what new state to change to (optionally) and what primitive(s) (if any) to execute. You might need multiple sets of states/events/tables/primitives for different layers/subsystems.
One of the many benefits of this approach is that you can think of it as building a custom language for your particular problem, in which you can very efficiently (i.e. with minimal extra code) create new functionality simply by modifying the table.
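Here is a host-language sketch of the table idea, loosely based on the telephone example; on the 8051 the table would be a constant byte array in code memory and the primitives small routines selected by index, but the shape is the same. All states, events and primitives below are invented:

# (state, event) -> (next_state, primitives to run)
IDLE, RINGING, IN_CALL = 0, 1, 2                 # states
EV_INCOMING, EV_OFFHOOK, EV_ONHOOK = 0, 1, 2     # events

def start_ringer():  print("ring ring")
def stop_ringer():   print("silence")
def open_audio():    print("audio path open")
def close_audio():   print("audio path closed")

TABLE = {
    (IDLE,    EV_INCOMING): (RINGING, [start_ringer]),
    (RINGING, EV_OFFHOOK):  (IN_CALL, [stop_ringer, open_audio]),
    (RINGING, EV_ONHOOK):   (IDLE,    [stop_ringer]),
    (IN_CALL, EV_ONHOOK):   (IDLE,    [close_audio]),
}

def dispatch(state: int, event: int) -> int:
    next_state, primitives = TABLE.get((state, event), (state, []))
    for primitive in primitives:
        primitive()
    return next_state

Adding a new feature is then usually just a new table row or a new primitive, not new control-flow code.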
Sorry I am months late and you probably didn't have time to do something this radical anyway. For all I know you were already using a similar approach! But my answer might help someone else someday who knows.
In the whacked-out department, you could also consider compressing part of your code and only keeping the part that is actively used decompressed at any particular point in time. I have a hard time believing that the code required for the compress/decompress system would be a small enough portion of the 8051's tiny memory to make this worthwhile, but it has worked wonders on slightly larger systems.
Yet another approach is to turn to a byte-code format or the kind of table-driven code that some state machine tools output -- having a machine understand what your app is doing and generating a completely incomprehensible implementation can be a great way to save room :)
Finally, if the code is indeed compiled in C, I would suggest compiling with a range of different options to see what happens. Also, I wrote a piece on compact C coding for the ESC back in 2001 that is still pretty current. See that text for other tricks for small machines.
1) Where possible, keep your variables in idata rather than xdata.
2) Look at your JMP statements - make use of SJMP and AJMP.
I assume you know it won't fit because you wrote/compiled it and got the "out of memory" error. :) It appears the answers address your question pretty accurately, short of giving code examples.
I would, however, recommend a few additional thoughts:
Make sure all the code is really being used -- a code coverage test? An unused sub is a big win. This is a tough step -- if you're the original author, it may be easier -- (well, maybe) :)
Check the level of "verification" and initialization -- sometimes we have a tendency to be over-zealous in ensuring we have initialized variables/memory, and rightly so -- how many times have we been bitten by it? Not saying don't initialize (duh), but if we're doing a memory move, the destination doesn't need to be zeroed first -- this dovetails with the previous point.
Evaluate the new features -- can an existing sub be enhanced to cover both functions, or perhaps an existing feature be replaced?
Break up big code if a piece of the big code can save creating new little code.
or perhaps there's an argument for hardware version 2.0 on the table now ... :)
regards
Besides the already mentioned (more or less) obvious optimizations, here is a really weird (and almost impossible to achieve) one: code reuse. And by code reuse I don't mean the normal kind, but a) reusing your code as data or b) reusing your code as other code. Maybe you can create a LUT (or whatever static data) that can be represented by the ASM hex opcodes (here you have to consider Harvard vs. von Neumann architecture).
The other is to reuse code by giving it a different meaning when you address it differently. Here is an example to make clear what I mean. If the bytes of your code look like this: AABCCCDDEEFFGGHH at address X, where each letter stands for one opcode, imagine you now jump to X+1. Maybe you get completely different functionality, where the bytes, decoded from the new offset, form the new opcodes: ABC CCD DE EF GH.
But beware: this is not only tricky to achieve (maybe it's impossible), it's a horror to maintain. So unless you are writing demo-scene code (or something similarly exotic), I would recommend using the other ways to save memory already mentioned.

How far can you really go with "eventual" consistency and no transactions (aka SimpleDB)?

I really want to use SimpleDB, but I worry that without real locking and transactions the entire system is fatally flawed. I understand that for high-read/low-write apps it makes sense, since eventually the system becomes consistent, but what about that time in between? Seems like the right query in an inconsistent db would perpetuate havoc throughout the entire database in a way that's very hard to track down. Hopefully I'm just being a worry wart...
This is the pretty classic battle between consistency and scalability and - to some extent - availability. Some data doesn't always need to be that consistent. For instance, look at digg.com and the number of diggs against a story. There's a good chance that value is duplicated in the "digg" record rather than forcing the DB to do a join against the "user_digg" table. Does it matter if that number isn't perfectly accurate? Probably not. Then using something like SimpleDB might be a good fit. However if you are writing a banking system, you should probably value consistency above all else. :)
Unless you know from day 1 that you will have to deal with massive scale, I would stick to simpler, more conventional systems like an RDBMS. If you are working somewhere with a reasonable business model, you will hopefully see a big spike in revenue if there's a big spike in traffic, and then you can use that money to help solve the scaling problems. Scaling is hard, and scaling is hard to predict. Most of the scaling problems that hurt you will be ones you never expected.
I would much rather get a site off the ground and spend a few weeks fixing scale issues when traffic picks up than spend so much time worrying about scale that we never make it to production because we run out of money. :)
Assuming you're talking about this SimpleDB, you're not being a worrywart; there are real reasons not to use it as a real world DBMS.
The properties that you get from transaction support in a DBMS can be abbreviated by the acronym "A.C.I.D.": Atomicity, Consistency, Isolation, and Durability. The A and D have mostly to do with system crashes, and the C and I have to do with regular operation. They're all things people totally take for granted when working with commercial databases, so if you work with a database that doesn't have one or more of them, you might be in for any number of nasty surprises.
Atomicity: Any transaction will either complete fully or not at all (i.e. it will either commit or abort cleanly). This applies to single statements (like "UPDATE table ...") as well as longer, more complicated transactions. If you don't have this, then anything that goes wrong (like, the disk getting full, the computer crashing, etc.) might leave something half-done. In other words, you can't ever rely on the DBMS to really do the things you tell it to, because any number of real-world problems can get in the way, and even a simple UPDATE statement might get partially completed.
Consistency: Any rules you've set up about the database will always be enforced. Like, if you have a rule that says A always equals B, then nothing anybody does to the database system can break that rule - it'll fail any operation that tries. This isn't quite as important if all your code is perfect ... but really, when is that ever the case? Plus, if you're missing this safety net, things get really yucky when you lose ...
Isolation: Any actions taken on the database will execute as if they happened serially (one at a time), even if in reality they're happening concurrently (interleaved with each other). If more than one user is going to hit this database at the same time, and you don't have this, then things you can't even dream up will go wrong; even atomic statements can interact with each other in unforeseen ways and screw things up.
Durability: If you lose power or the software crashes, what happens to database transactions that were in progress? If you have durability, the answer is "nothing - they're all safe". Databases do this by using something called "Undo / Redo Logging", where every little thing you do to the database is first logged (typically on a separate disk for safety) in a way such that you can reconstruct the current state after a failure. Without that, the other properties above are sort of useless, because you can never be 100% sure that things will stay consistent after a crash.
Do any of these things matter to you? The answer has everything to do with the types of transactions you're doing, and what guarantees you want in a failure situation. There may well be cases (like a read-only database) where you don't need these, but as soon as you start doing anything non-trivial, and something bad happens, you'll wish you had 'em. Maybe it's OK for you to just revert to a backup anytime something unexpected happens, but my guess is that it isn't.
Also note that dropping all of these protections doesn't make it a given that your database will perform better; in fact, it's probably the opposite. That's because real-world DBMS software also has tons of code to optimize query performance. So, if you write a query that joins 6 tables on SimpleDB, don't assume that it'll figure out the optimal way to run that query - you might end up waiting hours for it to complete, when a commercial DBMS could use an indexed hash join and get it in .5 seconds. There are a zillion little tricks that you can do to optimize query performance, and believe me, you'll really miss them when they're gone.
None of this is meant as a knock on SimpleDB; take it from the author of the software: "Although it is a great teaching tool, I can't imagine that anyone would want to use it for anything else."

Performance metrics on specific routines: any best practices?

I'd like to gather metrics on specific routines of my code to see where I can best optimize. Let's take a simple example and say that I have a "Class" database with multiple "Students." Let's say the current code calls the database for every student instead of grabbing them all at once in a batch. I'd like to see how long each trip to the database takes for every student row.
This is in C#, but I think it applies everywhere. Usually when I get curious as to a specific routine's performance, I'll create a DateTime object before it runs, run the routine, and then create another DateTime object after the call and take the milliseconds difference between the two to see how long it runs. Usually I just output this in the page's trace...so it's a bit lo-fi. Any best practices for this? I thought about being able to put the web app into some "diagnostic" mode and doing verbose logging/event log writing with whatever I'm after, but I wanted to see if the stackoverflow hive mind has a better idea.
For database queries, you have two small problems, both to do with caching: the data cache and the statement cache.
If you run the query once, the statement is parsed, prepared, bound and executed. Data is fetched from files into cache.
When you execute the query a second time, the cache is used, and performance is often much, much better.
Which is the "real" performance number? The first one or the second one? Some folks say "worst case" is the real number, and we have to optimize that. Others say "typical case" and run the query twice, ignoring the first one. Others say "average" and run it 30 times, averaging them all. Others say "typical average": run it 31 times and average the last 30.
I suggest that the "last 30 of 31" is the most meaningful DB performance number. Don't sweat the things you can't control (parse, prepare, bind times). Sweat the stuff you can control -- data structures, I/O loading, indexes, etc.
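A quick sketch of that "last 30 of 31" measurement, with sqlite3 standing in for the real database and an invented query:

# Run the query 31 times, discard the first (cold-cache) run, average the rest.
import sqlite3
import time

def time_query(conn: sqlite3.Connection, sql: str, runs: int = 31) -> float:
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        timings.append(time.perf_counter() - start)
    warm = timings[1:]  # drop the first, cold-cache run
    return sum(warm) / len(warm)

conn = sqlite3.connect("example.db")
print(f"typical query time: {time_query(conn, 'SELECT * FROM students') * 1000:.2f} ms")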
I use this method on occasion and find it to be fairly accurate. The problem is that in large applications with a fairly hefty amount of debugging logs, it can be a pain to search through the logs for this information. So I use external tools (I program in Java primarily, and use JProbe) which allow me to see average and total times for my methods, how much time is spent exclusively by a particular method (as opposed to the cumulative time spent by the method and any method it calls), as well as memory and resource allocations.
These tools can assist you in measuring the performance of entire applications, and if you are doing a significant amount of development in an area where performance is important, you may want to research the tools available and learn how to use one.
Sometimes the approach you take will give you the best look at your application's performance.
One thing I can recommend is to use System.Diagnostics.Stopwatch instead of DateTime:
DateTime is accurate only to about 16 ms, whereas Stopwatch is accurate to the CPU tick.
But I recommend complementing it with custom performance counters in production, and running the app under a profiler during development.
There are some profilers available but, frankly, I think your approach is better. The profiler approach is overkill. Maybe the use of profilers is worth the trouble if you absolutely have no clue where the bottleneck is. I would rather spend a little time analyzing the problem up front and putting in a few strategic print statements than figure out how to instrument the app for profiling and then pore over gargantuan reports where every executable line of code is timed.
If you're working with .NET, then I'd recommend checking out the Stopwatch class. The times you get back from that are going to be much more accurate than an equivalent sample using DateTime.
I'd also recommend checking out ANTS Profiler for scenarios in which performance is exceptionally important.
It is worth considering investing in a good commercial profiler, particularly if you ever expect to have to do this a second time.
The one I use, JProfiler, works in the Java world and can attach to an already-running application, so no special instrumentation is required (at least with the more recent JVMs).
It very rapidly builds a sorted list of hotspots in your code, showing which methods your code is spending most of its time inside. It filters pretty intelligently by default, and allows you to tune the filtering further if required, meaning that you can ignore the detail of third party libraries, while picking out those of your methods which are taking all the time.
In addition, you get lots of other useful reports on what your code is doing. It paid for the cost of the licence in the time I saved the first time I used it; I didn't have to add lots of logging statements and construct a mechanism to analyse the output: the developers of the profiler had already done all of that for me.
I'm not associated with ej-technologies in any way other than being a very happy customer.
I use this method and I think it's very accurate.
I think you have a good approach. I recommend that you produce "machine friendly" records in the log file(s) so that you can parse them more easily. Something like CSV or other-delimited records that are consistently structured.