IntelliJ IDEA has a nice feature for viewing the differences in JUnit tests when an assertEquals check fails. It's explained on the JetBrains website: https://www.jetbrains.com/help/idea/viewing-and-exploring-test-results.html
However, sometimes the link to see differences is simply missing and it's not possible to compare anymore. I believe it might be caused by the length of the compared strings, as it works when you compare strings of 3k bytes but does not work with strings of 6k bytes.
Is there a configuration parameter for this, or any workaround to make it work with longer strings?
Please see the answer to your question at the issue:
https://youtrack.jetbrains.com/issue/IDEA-142886
You may change the threshold by passing -Didea.junit.message.length.threshold with the maximum message length you expect. The threshold was introduced because of performance problems in the java.util.regex.Pattern used to detect the diff, which slows down the tests when the output is big.
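For context, here is a minimal sketch of a test that runs into this case, assuming JUnit 5 and Java 11+ (the roughly 6k character size mirrors the question). With the default threshold the failure message can be too long for the diff link to appear; a value like -Didea.junit.message.length.threshold=10000 in the run configuration's VM options is the kind of setting meant above.

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class LongStringDiffTest {

    @Test
    void longStringsDiffer() {
        // Two ~6k character strings that differ only in the last character.
        String expected = "a".repeat(6000) + "X";
        String actual = "a".repeat(6000) + "Y";
        assertEquals(expected, actual); // fails; whether IDEA offers the diff link depends on the threshold
    }
}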
I'm using OptaPlanner to make a schedule and it works quite well.
After reading the documentation I have realised that I should use at least one (or more) shadow variables, since my Drools file calls methods that do a lot of calculations based on the value of the planning variable.
I spent a couple of hours rewriting my code to use a shadow variable, but then I noticed that the initial solution was really bad (compared to not having shadow variables) and I had to wait several minutes just to get an OK result. Is this normal? It did not look like the initial solution used the shadow variable at all.
The question is very generic, and so my answer will be, too.
Sometimes you can simplify the problem by introducing shadow variables or other forms of caching. If you find the right balance, you can indeed speed up the Drools calculation and - as a result - get to the same solution in a shorter amount of time, and therefore reach better solutions in the same amount of time.
That said, introducing shadow variables shouldn't really change your scores - only how quickly they're calculated. If you're seeing different scores for the same @PlanningSolution, you have in fact changed your problem, and the relative performance is no longer comparable.
Also, you may want to check out environment modes to make sure you haven't inadvertently introduced score corruptions into your problem.
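As a rough sketch of that check, assuming OptaPlanner 8.x-style APIs (MySchedule and the config file name are placeholders for your own solution class and resource), the environment mode can be switched on while debugging. FULL_ASSERT is very slow, but it reports score corruptions such as a shadow variable that isn't kept up to date correctly:

import org.optaplanner.core.api.solver.Solver;
import org.optaplanner.core.api.solver.SolverFactory;
import org.optaplanner.core.config.solver.EnvironmentMode;
import org.optaplanner.core.config.solver.SolverConfig;

public class DebugSolverRunner {

    // Builds a solver from the existing XML config, but with FULL_ASSERT enabled
    // so that OptaPlanner verifies the score after every move while debugging.
    public static Solver<MySchedule> buildDebugSolver() {
        SolverConfig solverConfig = SolverConfig.createFromXmlResource("solverConfig.xml");
        solverConfig.setEnvironmentMode(EnvironmentMode.FULL_ASSERT); // very slow, debugging only
        SolverFactory<MySchedule> solverFactory = SolverFactory.create(solverConfig);
        return solverFactory.buildSolver();
    }
}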
Is it possible for a program to contain a fault that dynamic testing cannot find as a failure? Is there any simple example?
Please help! Thanks.
Yes. Testing can only prove the absence of bugs for what you tested. Dynamic testing cannot cover all possible inputs and outputs in all environments with all dependencies.
The first way is to simply not test the code in question. This can be checked by measuring the coverage of your tests, but even if you achieve 100% coverage there can still be flaws.
The next is to not check all possible types and ranges of inputs. For example, if you have a function that scans for a word in a string, you need to check for the following (a test sketch for these cases appears after the lists below):
The word at the start of the string.
The word at the end of the string.
The word in the middle of the string.
A string without the word.
The empty string.
These are known as boundary conditions and include things like:
0
Negative numbers
Empty strings
Null
Extremely large values
Decimals
Unicode
Empty files
Extremely large files
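To make that concrete, here is a hedged JUnit 5 sketch of boundary tests for a hypothetical containsWord() helper; the in-line implementation is only a stand-in so the example is self-contained:

import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

class WordSearchTest {

    // Hypothetical implementation under test; a real one would live in production code.
    static boolean containsWord(String haystack, String needle) {
        return haystack != null && !needle.isEmpty() && haystack.contains(needle);
    }

    @Test
    void wordAtStart() { assertTrue(containsWord("fox jumps over", "fox")); }

    @Test
    void wordAtEnd() { assertTrue(containsWord("jumps over the fox", "fox")); }

    @Test
    void wordInMiddle() { assertTrue(containsWord("the fox jumps", "fox")); }

    @Test
    void stringWithoutWord() { assertFalse(containsWord("lazy dog", "fox")); }

    @Test
    void emptyString() { assertFalse(containsWord("", "fox")); }

    @Test
    void nullString() { assertFalse(containsWord(null, "fox")); }
}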
If the code in question keeps state, maybe in an object, maybe in global variables, you have to test that state does not become corrupted or interfere with subsequent runs.
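As a small illustration, one cheap check for leftover state is to run the same operation twice and expect identical results (Formatter here is a hypothetical class that keeps internal state):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class StatefulCodeTest {

    @Test
    void repeatedCallsGiveTheSameResult() {
        // If format() keeps state in a static field or global variable,
        // the second call may return something different from the first.
        String first = Formatter.format(42);
        String second = Formatter.format(42);
        assertEquals(first, second);
    }
}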
If you're doing parallel processing you must test any number of possibilities for deadlocks or corruption resulting from trying to do the same thing at the same time. For example, two processes trying to write to the same file. Or two processes both waiting for a lock on the same resource. Do they lock only what they need? Do they give up their locks ASAP?
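For example, a crude race-condition test hammers a shared object from two threads and checks that no updates were lost (Counter is a hypothetical class with increment() and value() methods):

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.junit.jupiter.api.Test;

class CounterRaceTest {

    @Test
    void parallelIncrementsAreNotLost() throws Exception {
        Counter counter = new Counter();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> {
                for (int j = 0; j < 100_000; j++) {
                    counter.increment();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // Without proper locking, the lost-update race makes the total come up short.
        assertEquals(200_000, counter.value());
    }
}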
Once you test all the ways the code is supposed to work, you have to test all the ways it can fail: whether it fails gracefully with an exception (instead of garbage), whether an error leaves it in a corrupted state, and so on. How does it handle resource failure, like failing to connect to a database? This becomes particularly important when working with databases and files, to ensure a failure doesn't leave things partially altered.
For example, if you're transferring money from one account to another you might write:
my $from_balance = get_balance($from);        # read both balances first...
my $to_balance   = get_balance($to);
set_balance($from, $from_balance - $amount);  # ...then write them back one at a time
set_balance($to,   $to_balance + $amount);    # a crash between the two writes loses $amount
What happens if the program crashes after the first set_balance? What happens if another process changes either balance between get_balance and set_balance? These sorts of concurrency issues must be thought of and tested.
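One common way to guard against both failure modes is to do the reads and writes inside a single database transaction. A rough JDBC sketch, with made-up table and column names, might look like this:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Transfer {

    // Both updates commit together or not at all; a crash or error rolls both back.
    public static void transfer(Connection conn, long from, long to, long amount) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement debit = conn.prepareStatement(
                     "UPDATE accounts SET balance = balance - ? WHERE id = ?");
             PreparedStatement credit = conn.prepareStatement(
                     "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
            debit.setLong(1, amount);
            debit.setLong(2, from);
            debit.executeUpdate();
            credit.setLong(1, amount);
            credit.setLong(2, to);
            credit.executeUpdate();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}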
Then there are all the different environments the code could run in. Different operating systems. Different compilers. Different dependencies. Different databases. And all with different versions. All of these have to be tested.
The test can simply be wrong. It can be a mistake in the test. It can be a mistake in the spec. Generally one tests the same code in different ways to avoid this problem.
The test can be right, the spec can be right, but the feature is wrong. It could be a bad design. It could be a bad idea. You can argue this isn't a "bug", but if the users don't like it, it needs to be fixed.
If your testing makes use of a lot of mocking, your mocks may not reflect how the thing being mocked actually behaves.
And so on.
For all these flaws, dynamic testing remains the best we've got for testing more than a few dozen lines of code.
I've got a legacy system that processes extremely complex data that's changing every second. The modularity of the system is quite poor so I can't split the business logic into smaller modules to ease functional testing.
The current test approach is "close your eyes, click and pray", which is not acceptable at all. I want to be confident about the changes we commit to the code.
What are the testing good practices, the bibles to read, and the changes to make to increase confidence in such a system?
The question is not about unit testing: the system wasn't designed for that, it takes too much time to decouple, mock and stub all the dependencies, and most of all we sadly don't have the time and budget for that. I don't want a philosophical debate about functional testing: I want facts that work in real life.
It sounds like you have yourself a black box as regards testing.
http://en.wikipedia.org/wiki/Black-box_testing
To put it simply, it's horrible, but may be all you can do if you can't isolate the system in any way.
You need to insert known data into your system and compare the result with the known output.
You really need known data & output for the following (a characterization-test sketch follows this list):
normal values - normal data - you'll find out that it can at least seem to do the right thing
erroneous values - spelling errors, invalid values - so you know that it will tell you if the input is rubbish
out of range - -1 where only non-negative values are expected, values greater than about 2.1 billion (the signed 32-bit maximum), and so on - so you know it won't crash out on seriously mis-entered or corrupted data
dangerous - input that would break the SQL, simulate SQL injection
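A hedged sketch of what such a test can look like in JUnit 5, in a "golden master" style: captured input files go through the system's entry point and the result is compared with stored known-good output. LegacySystem.process() and the fixture file names are placeholders for whatever entry point and data you actually have.

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.Test;

class GoldenMasterTest {

    @Test
    void normalValuesProduceTheKnownOutput() throws Exception {
        String input = Files.readString(Path.of("fixtures/normal-input.txt"));
        String expected = Files.readString(Path.of("fixtures/normal-expected.txt"));
        assertEquals(expected, LegacySystem.process(input));
    }

    @Test
    void erroneousValuesAreReportedNotPassedThrough() throws Exception {
        String input = Files.readString(Path.of("fixtures/invalid-input.txt"));
        String expected = Files.readString(Path.of("fixtures/invalid-expected.txt"));
        assertEquals(expected, LegacySystem.process(input));
    }
}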
Lastly, make sure that all errors are carefully handled rather than just being logged while the bad/corrupt/null value gets passed on through the system.
Any processes you can isolate and test that way will make debugging easier, as black box testing can't tell you where the error occurred. This means you then need to diagnose the errors based on what happened, more in the style of House MD than a normal debugging session.
Once you have the different data types listed above, you can test all changes in isolation with them, and then in the system as a whole. Over time, as you eventually touch most aspects of the system, you'll have test cases for all areas and be able to say more easily where a failure was most likely to have occurred.
Also: make sure you put tracers in your known data so you can take it out of the result flow before it ends up on a CEO's desk - you don't want to accidentally indicate a stock-market crash while you're testing the range limits on a module.
I hope that's some help
Working Effectively with Legacy Code by Michael Feathers (http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052) seems to be the book for these situations.
I have a Dojo number control that shows numbers with 30 digits after the decimal point. It formats numbers correctly, but when the number is small enough, e.g. 8e-13, the control shows something like 8e-13,000000000000000000000000000000 rather than 0,000000000000800000000000000000. Apparently it fails and becomes marked as invalid. I tried passing "round: -1" in its constraints, without any result. I have also noticed that 1.0000000000008 is shown correctly in the control. What could be the reason for this strange formatting?
Thanks.
Dijit simply doesn't handle these cases well; it's designed for simpler cases. Exponential representation breaks the formatting routines, so numbers at the extremes simply don't work, as you've noticed. There is an option to format numbers in exponential notation, but it's largely unimplemented.
I'm currently working on my first project in .NET 4.0, and it requires several thousand string comparisons (I'm searching directories, and sometimes entire drives, for certain files). For the most part the strings are quite short, because I'm only looking at file paths, so I have just used String.Contains() to see if the file path string contains my needle string.
I was wondering though, would Regex be a better idea? At what point will the Regex be faster than a standard string comparison? Is it based on the length of the strings being compared or the number of strings being compared?
It's variable. Comparison performance is a complex function of the input data, the culture being used for comparing, case sensitivity and CompareOptions. A Regex object is more expensive to instantiate (unless it's in the Regex cache), so if you're doing a lot of one-off comparisons it's not that great to use, and I've found it's typically slower than IndexOf(), but YMMV.
Keep in mind that when using Contains/IndexOf, the culture under which the user/thread is running decides how the comparison is done. That can have a significant impact on performance; not all cultures are equally fast.
The Invariant culture is a very fast culture. If you use a CompareInfo directly, rather than doing String.IndexOf(), it will be somewhat faster still.
CultureInfo.InvariantCulture.CompareInfo.IndexOf(..)
The only way to have some confidence in making the right choice is to benchmark. That said, unless you're sifting through many megabytes of strings, it won't make a difference that matters to anyone. As ChrisF said earlier, focus on readable/maintainable code in that case.
Here's a good article on getting the most out of regex:
Optimizing Regular Expression Performance
If your search expression is simple then I don't think it's worth moving to a Regex - no matter how good you are at coding and reading them, it will take you more time to understand the code when you (or more importantly, someone else) look at it again in six months' time.
If the speed improvements are only marginal stay with the more readable, maintainable code.
I'm just guessing, but I suspect that for simple substring searches there will be little difference in performance between String.Contains(), String.IndexOf() and regex (if anything, I'd guess that regex would never be faster, but might be slower by a minuscule amount).
You shouldn't give any thought to moving to regex unless your requirements are (or become) such that you need to match something more complex than a substring.
In .NET 4.0 there is an issue with the String.IndexOf call (see Hotfix 2467309); it may help you decide your answer.