FIFO semaphore test - testing

I have implemented FIFO semaphores but now I need a way to test/prove that they are working properly. A simple test would be to create some threads that try to wait on a semaphore and then print a message with a number and if the numbers are in order it should be FIFO, but this is not good enough to prove it because that order could have occurred by chance. Thus, I need a better way of testing it.
If necessary locks or condition variables can be used too.
Thanks

What you describe with your sentence "but this is not good enough to prove it because that order could have occurred by chance" is somehow a known dilema.
1) Even if you have a specification, you can not ensure that the specification match your intention. To illustrate this I will take an example from "the limit of correctness". Let's consider a specification for a factorization function that is:
Compute A and B such as A * B = C
But it's not enough as you could have an implementation that returns A=1 and B=C. Adding A,B != 1 can still lead to A=-1 and B=-C, so the only correct specification must state A,B>1. That's just to illustrate how complicated it can be to write a specification that match the real intention.
2) Even having proved an algorithm, still doesn't mean the implementation is correct in practice. This is best illustrated with this quote from Donald Knuth:
Beware of bugs in the above code; I
have only proved it correct, not tried
it.
3) Testing can only reveal the presence of bug, not their absence. This quote goes back to Dijkstra:
Testing can be used to show the
presence of bugs but never to show
their absence.
Conclusion: you are doomed and you will never be 100% sure that your code is correct according to its intent! But stuff aren't that bad. Having a high confidence about the code is usually enough. For instance, if using multiple threads is still not enough for you, you can decide to use fuzzing as well so as to randomize the test execution even more. If your tests always pass, well, you can be pretty confident that your code is good.

because that order could have occurred by chance.
You can run the test a few times, e.g. 10, and test that each time the order was correct. This will ensure that it happened not by chance.
P.S. Multiple threads in a unit test is usually avoided

Related

Is it better to use a boolean variable to replace an if condition for readability or not?

I am in the second year of my bachelor study in information technology. Last year in one of my courses they taught me to write clean code so other programmers have an easier time working with your code. I learned a lot about writing clean code from a video ("clean code") on pluralsight (paid website for learning which my school uses). There was an example in there about assigning if conditions to boolean variables and using them to enhance readability. In my course today my teacher told me it's very bad code because it decreases performance (in bigger programs) due to increased tests being executed. I was wondering now whether I should continue using boolean variables for readability or not use them for performance. I will illustrate in an example (I am using python code for this example):
example boolean variable
Let's say we need to check whether somebody is legal to drink alcohol we get the persons age and we know the legal drinking age is 21.
is_old_enough = persons_age >= legal_drinking_age
if is_old_enough:
do something
My teacher told me today that this would be very bad for performance since 2 tests are performed first persons_age >= legal_drinking_age is tested and secondly in the if another test occurs whether the person is_old_enough.
My teacher told me that I should just put the condition in the if, but in the video they said that code should be read like natural language to make it clear for other programmers. I was wondering now which would be the better coding practice.
example condition in if:
if persons_age >= legal_drinking_age:
do something
In this example only 1 test is tested whether persons_age >= legal_drinking_age. According to my teacher this is better code.
Thank you in advance!
yours faithfully
Jonas
I was wondering now which would be the better coding practice.
The real safe answer is : Depends..
I hate to use this answer, but you won't be asking unless you have faithful doubt. (:
IMHO:
If the code will be used for long-term use, where maintainability is important, then a clearly readable code is preferred.
If the program speed performance crucial, then any code operation that use less resource (smaller dataSize/dataType /less loop needed to achieve the same thing/ optimized task sequencing/maximize cpu task per clock cycle/ reduced data re-loading cycle) is better. (example keyword : space-for-time code)
If the program minimizing memory usage is crucial, then any code operation that use less storage and memory resource to complete its operation (which may take more cpu cycle/loop for the same task) is better. (example: small devices that have limited data storage/RAM)
If you are in a race, then you may what to code as short as possible, (even if it may take a slightly longer cpu time later). example : Hackathon
If you are programming to teach a team of student/friend something.. Then readable code + a lot of comment is definitely preferred .
If it is me.. I'll stick to anything closest to assembly language as possible (as much control on the bit manipulation) for backend development. and anything closest to mathematica-like code (less code, max output, don't really care how much cpu/memory resource is needed) for frontend development. ( :
So.. If it is you.. you may have your own requirement/preference.. from the user/outsiders/customers point of view.. it is just a working/notWorking program. YOur definition of good program may defer from others.. but this shouldn't stop us to be flexible in the coding style/method.
Happy exploring. Hope it helps.. in any way possible.
Performance
Performance is one of the least interesting concerns for this question, and I say this as one working in very performance-critical areas like image processing and raytracing who believes in effective micro-optimizations (but my ideas of effective micro-optimization would be things like improving memory access patterns and memory layouts for cache efficiency, not eliminating temporary variables out of fear that your compiler or interpreter might allocate additional registers and/or utilize additional instructions).
The reason it's not so interesting is, because, as pointed out in the comments, any decent optimizing compiler is going to treat those two you wrote as equivalent by the time it finishes optimizing the intermediate representation and generates the final results of the instruction selection/register allocation to produce the final output (machine code). And if you aren't using a decent optimizing compiler, then this sort of microscopic efficiency is probably the last thing you should be worrying about either way.
Variable Scopes
With performance aside, the only concern I'd have with this convention, and I think it's generally a good one to apply liberally, is for languages that don't have a concept of a named constant to distinguish it from a variable.
In those cases, the more variables you introduce to a meaty function, the more intellectual overhead it can have as the number of variables with a relatively wide scope increases, and that can translate to practical burdens in maintenance and debugging in extreme cases. If you imagine a case like this:
some_variable = ...
...
some_other_variable = ...
...
yet_another_variable = ...
(300 lines more code to the function)
... in some function, and you're trying to debug it, then those variables combined with the monstrous size of the function starts to multiply the difficulty of trying to figure out what went wrong. That's a practical concern I've encountered when debugging codebases spanning millions of lines of code written by all sorts of people (including those no longer on the team) where it's not so fun to look at the locals watch window in a debugger and see two pages worth of variables in some monstrous function that appears to be doing something incorrectly (or in one of the functions it calls).
But that's only an issue when it's combined with questionable programming practices like writing functions that span hundreds or thousands of lines of code. In those cases it will often improve everything just focusing on making reasonable-sized functions that perform one clear logical operation and don't have more than one side effect (or none ideally if the function can be programmed as a pure function). If you design your functions reasonably then I wouldn't worry about this at all and favor whatever is readable and easiest to comprehend at a glance and maybe even what is most writable and "pliable" (to make changes to the function easier if you anticipate a future need).
A Pragmatic View on Variable Scopes
So I think a lot of programming concepts can be understood to some degree by just understanding the need to narrow variable scopes. People say avoid global variables like the plague. We can go into issues with how that shared state can interfere with multithreading and how it makes programs difficult to change and debug, but you can understand a lot of the problems just through the desire to narrow variable scopes. If you have a codebase which spans a hundred thousand lines of code, then a global variable is going to have the scope of a hundred thousands of lines of code for both access and modification, and crudely speaking a hundred thousand ways to go wrong.
At the same time that pragmatic sort of view will find it pointless to make a one-shot program which only spans 100 lines of code with no future need for extension avoid global variables like the plague, since a global here is only going to have 100 lines worth of scope, so to speak. Meanwhile even someone who avoids those like the plague in all contexts might still write a class with member variables (including some superfluous ones for "convenience") whose implementation spans 8,000 lines of code, at which point those variables have a much wider scope than even the global variable in the former example, and this realization could drive someone to design smaller classes and/or reduce the number of superfluous member variables to include as part of the state management for the class (which can also translate to simplified multithreading and all the similar types of benefits of avoiding global variables in some non-trivial codebase).
And finally it'll tend to tempt you to write smaller functions as well, since a variable towards the top of some function spanning 500 lines of code is going to also have a fairly wide scope. So anyway, my only concern when you do this is to not let the scope of those temporary, local variables get too wide. And if they do, then the general answer is not necessarily to avoid those variables but to narrow their scope.

Is it possible for a program cannot find the failure by using dynamic testing, but have fault?

Is it possible for a program cannot find the failure by using dynamic testing, but have fault? any simple example?
Please help! thanks.
Yes. Testing can only prove the absence of bugs for what you tested. Dynamic testing cannot cover all possible inputs and outputs in all environments with all dependencies.
First is to simply not test the code in question. This can be verified by checking the coverage of your test. Even if you achieve 100% coverage there can still be flaws.
Next is to not check all possible types and ranges of inputs. For example, if you have a function that scans for a word in a string, you need to check for...
The word at the start of the string.
The word at the end of the string.
The word in the middle of the string.
A string without the word.
The empty string.
These are known as boundary conditions and include things like:
0
Negative numbers
Empty strings
Null
Extremely large values
Decimals
Unicode
Empty files
Extremely large files
If the code in question keeps state, maybe in an object, maybe in global variables, you have to test that state does not become corrupted or interfere with subsequent runs.
If you're doing parallel processing you must test any number of possibilities for deadlocks or corruption resulting from trying to do the same thing at the same time. For example, two processes trying to write to the same file. Or two processes both waiting for a lock on the same resource. Do they lock only what they need? Do they give up their locks ASAP?
Once you test all the ways the code is supposed to work, you have to test all the ways that it can fail, whether it fails gracefully with an exception (instead of garbage), whether an error leaves it in a corrupted state, and so on. How does it handle resource failure, like failing to connect to a database? This becomes particularly important working with databases and files to ensure a failure doesn't leave things partially altered.
For example, if you're transferring money from one account to another you might write:
my $from_balance = get_balance($from);
my $to_balance = get_balance($to);
set_balance($from, $from_balance - $amount);
set_balance($to, $to_balance + $amount);
What happens if the program crashes after the first set_balance? What happens if another process changes either balance between get_balance and set_balance? These sorts of concurrency issues must be thought of and tested.
There's all the different environments the code could run in. Different operating systems. Different compilers. Different dependencies. Different databases. And all with different versions. All these have to be tested.
The test can simply be wrong. It can be a mistake in the test. It can be a mistake in the spec. Generally one tests the same code in different ways to avoid this problem.
The test can be right, the spec can be right, but the feature is wrong. It could be a bad design. It could be a bad idea. You can argue this isn't a "bug", but if the users don't like it, it needs to be fixed.
If your testing makes use of a lot of mocking, your mocks may not reflect how thing thing being mocked actually behaves.
And so on.
For all these flaws, dynamic testing remains the best we've got for testing more than a few dozen lines of code.

What is an error-guessing test case?

I would need a simple explanation on what a error-guessing test case it. Is it dangerous to use? I would appreciate an example.
Best regards,
Erica
Error guessing is documented here: http://en.wikipedia.org/wiki/Error_guessing
It's a name for something that's very common -- guessing where errors might occur based on your previous experience.
For example you have a routine that calculates whether a value inputted by a user from a terminal is a prime number:
You'd test the cases where errors tend to occur:
Empty input
Values that are not integers (floating point, letters, etc)
Values that are boundary cases 2, 3, 4
etc.
I would assume that every tester/QA person would be asked questions like this during an interview. It gives you a chance to talk about what procedures you've used in the past during testing.
I think the method goes like this:
1. Do formal testing
2. Use knowledge gained during formal testing about how the system works to make a list of places where defects might be
3. Design tests to verify whether those defects exist.
By its nature this process is very ad-hoc and unstructured.

Setting up conditional statements to take advantage of instruction prefetch (modern x86)

I've recently picked up Michael Abrash's The Zen of Assembly Language (from 1990) and was reading a section on how instruction prefetching is not always advantageous such as the case when branching occurs (a jump). This is because all of the instructions that were prefetched are no longer the ones to be executed and so more instructions must be fetched.
This reminded me of an optimization from another old book, Tricks of the Game Programming Gurus by Andre LaMothe in which he suggests that when setting up your conditional statements, you put the most frequently (or expected) path first.
For example:
if (booleanThatIsMostLikelyToBeTrue)
{
// ...expected code
// also the code that would've been prefetched
}
else
{
// ...exceptional or less likely code
}
My questions are:
1) Was LaMothe's optimization suggested with this in mind? (I no longer have the book)
2) Is this type of optimization still a worthwhile programming habit on modern machine? (maybe prefetching is handled differently than it used to be?)
You want to set up your code to branch as little as possible, and branch backwards when it does. A more reliable way to do that IF is to always do the common thing then test for the exception:
Do A;
if( test ) Do B;
Of course, this has to be arranged so that anything A does is reversed by B if B occurs.
The point of Zen programming is to try to eliminate the If statements altogether. So for example, instead of looping 10 times (which requires an exit condition test), you just write the same statement 10 times, voila!, no if statement. Another example is if you are looping a list, you use a sentinel to exit the loop, instead of testing an index value.
If you are working in C, it can be difficult to gimick the compiler into doing what you want. Putting something first or second in an IF statement will have no effect on compiled result. Note that it is critical to use the right compiler options. For example, using the /O2 (optimize for speed) in Visual C++ makes a HUGE difference in the compiled efficiency.
These kinds of optimizations can often be useful. However, it's typically something that you do after you've written the program, profiled it, and determined that the program would benefit from doing these micro-optimizations.
Also, many modern compilers have profile-guided optimization, which can relieve you of having to contort your code for performance purposes.

How to functionally test an extremely complex system?

I've got a legacy system that processes extremely complex data that's changing every second. The modularity of the system is quite poor so I can't split the business logic into smaller modules to ease functional testing.
The actual test system is: "close your eyes click and pray", which is not acceptable at all. I want to be confident on the changes we commit on the code.
What are the test good practices, the bibles to read, the changes to operate, to increase confidence in such a system.
The question is not about unit testing, the system wasn't designed for that and it takes too much time to decouple, mock and stub all the dependencies and most of all, we sadly don't have the time and budget for that. I don't want to a philosophic debate about functional testing: I want facts that work in real life.
It sounds like you have yourself a black box as regards testing.
http://en.wikipedia.org/wiki/Black-box_testing
To put it simply, it's horrible, but may be all you can do if you can't isolate the system in any way.
You need to insert known data into your system and compare the result with the known output.
You really need known data & output for
normal values - normal data - you'll find out that it can at least seem to do the right thing
erroneous values - spelling errors, invalid values - so you know that it will tell you if the input is rubbish
out of range - -1 on signed integers, values greater than 2.7 billion (assuming 32bit), and so on - so you know it won't crash out on seriously mis-inputted or corrupted data
dangerous - input that would break the SQL, simulate SQL injection
Lastly make sure that all errors are being carefully handled rather than getting logged and the bad/corrupt/null value getting passed on through the system.
Any processes you can isolate and test that way will make debugging easier, as black box testing can't tell you where the error occurred. This means then you need to diagnose the errors based on what happened, more in the style of House MD than a normal debugging session.
Once you have the different data types listed above, you can test all changes in isolation with them, and then in the system as a whole. Over time as you eventually touch most aspect of the system, you'll have test cases for all areas, and be able to say where failure was most likely to have occurred more easily.
Also: make sure you put tracers in your known data so you don't accidentally indicate a stockmarket crash when you're testing the range limits on a module, so you can take it out of the result flow before it ends up on a CEO's desk.
I hope that's some help
http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052 seems to be the book for these situations.