Exploration program goal returning as false, not sure what to do next - pddl

So I'm writing a PDDL model for Mars exploration, but the planner shows:
Suspected timeout.
ff: goal can be simplified to FALSE. No plan will solve it
What should I do next?

That's not a timeout: FF proved that it's impossible to reach the goal.
"Simplified" means it didn't even need to search, so it's probably a very fundamental modelling problem.
Try changing the goal to the preconditions of actions that you think should be achievable.
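To make that debugging step concrete, here is a minimal Python sketch of the kind of reachability check that can expose such a problem. It is not FF's actual implementation, only an illustration that ignores delete effects; the rover facts and action names (at-base, arm-free, take-sample and so on) are hypothetical and not taken from the asker's domain.

```python
from typing import FrozenSet, Iterable, NamedTuple, Set


class Action(NamedTuple):
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]


def reachable_facts(init: Iterable[str], actions: Iterable[Action]) -> Set[str]:
    """Facts that can become true when delete effects are ignored."""
    facts = set(init)
    actions = list(actions)
    changed = True
    while changed:
        changed = False
        for action in actions:
            applicable = action.preconditions <= facts
            adds_something = not action.add_effects <= facts
            if applicable and adds_something:
                facts |= action.add_effects
                changed = True
    return facts


def unreachable_goals(init, actions, goal) -> Set[str]:
    """Goal facts that no sequence of actions can ever make true."""
    return set(goal) - reachable_facts(init, actions)


# Hypothetical model: taking a sample needs a free arm, but neither the
# initial state nor any action ever provides "arm-free".
actions = [
    Action("drive", frozenset({"at-base"}), frozenset({"at-crater"})),
    Action("take-sample", frozenset({"at-crater", "arm-free"}),
           frozenset({"have-sample"})),
]
init = {"at-base"}
goal = {"have-sample"}

print(unreachable_goals(init, actions, goal))  # {'have-sample'}
# Now use the preconditions of take-sample as the goal, as suggested above:
print(unreachable_goals(init, actions, {"at-crater", "arm-free"}))  # {'arm-free'}
```

Swapping the goal for the preconditions of the action you expected to work narrows the problem down to the one fact that can never be established.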

Related

How can I avoid executing a whole failing test and relaunch it from a specific point instead?

Using Selenium Grid and JUnit 5, I am executing tests that are way too long. Some of them may take about 30 minutes to complete. Often what is failing is a silly locator, which I can easily fix in a few seconds. Then, in order to keep testing and check that the change has actually fixed the failure, I have to rerun the test from the very beginning. So, is there a way to avoid this and resume the test from a specific point?
Thanks in advance
AFAIK this is not possible.
Also, generally, tests should not be too long and too complex.
Making tests long and complex makes debugging and failure analysis much harder.

What is a sanity test/check

What is it and why is it used/useful?
A sanity test isn't limited in any way to the context of programming or software engineering. A sanity test is just a casual term to mean that you're testing/confirming/validating something that should follow very clear and simple logic. It's asking someone else to confirm that you are not insane and that what seems to make sense to you also makes sense to them... or did you down way too many energy drinks in the last 4 hours to maintain sanity?
If you're bashing your head on the wall completely at a loss as to why something very simple isn't working... you would ask someone to do a quick sanity test for you. Have them make sure you didn't overlook that semicolon at the end of your for loop the last 15 times you looked it over. Extremely simple example, really shouldn't happen, but sometimes you're too close to something to step back and see the whole. A different perspective sometimes helps to make sure you're not completely insane.
The difference between smoke and sanity, at least as I understand it, is that a smoke test is a quick test to see that after a build the application is good enough for testing. Then you do a sanity test, which tells you whether a particular functional area is good enough that it actually makes sense to proceed with tests on this area.
Example:
Smoke Test: I can launch the application and navigate through all the screens and application does not crash.
- If the application crashes or I cannot access all screens, this build has something really wrong, there is "a fire" that needs to be extinguished ASAP, and the version is not good for testing.
Sanity Test (For Users Management screen): I can get to Users Management screen, create a user and delete it.
So, the application passed the Smoke Test, and now I proceed to Sanity Tests for different areas. If I cannot rely on the application to create a user and to delete it, it is worthless to test more advanced functionalities like user expiration, logins, etc... However, if sanity test has passed, I can go on with the test of this area.
Good example is a sanity check for a database connection.
SELECT 1 FROM DUAL
It's a simple query to test the connection, see:
SELECT 1 from DUAL: MySQL
It doesn't test deep functionality, only that the connection is ok to proceed with.
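As a rough sketch of the same idea in code, the Python snippet below runs a trivial query before doing any real work. It assumes an in-memory SQLite database purely so the example is self-contained; SQLite has no DUAL table, so the query is a plain SELECT 1, and the helper name connection_is_sane is made up for illustration.

```python
import sqlite3


def connection_is_sane(conn: sqlite3.Connection) -> bool:
    """Return True if the connection can answer a trivial query."""
    try:
        row = conn.execute("SELECT 1").fetchone()
    except sqlite3.Error:
        # Covers closed or otherwise unusable connections.
        return False
    return row == (1,)


conn = sqlite3.connect(":memory:")
print(connection_is_sane(conn))  # True
conn.close()
print(connection_is_sane(conn))  # False
```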
A sanity test or sanity check is a basic test to quickly evaluate whether a claim or the result of a calculation can possibly be true # http://en.wikipedia.org/wiki/Sanity_testing
A smoke test is a quick test of a new build for stability.
A sanity test is a test of a newly deployed environment.
The basic concept behind a sanity check is making sure that the results of running your code line up with the expected results. Other than being something that gets used far less often than it should, a proper sanity check helps ensure that what you're doing doesn't go completely out of bounds and do something it shouldn't as a result. The most common use for a sanity check is to debug code that's misbehaving, but even a final product can benefit from having a few in place to prevent unwanted bugs from emerging as a result of GIGO (garbage in, garbage out).
Relatedly, never underestimate the ability of your users to do something you didn't expect anyone would actually do. This is a lesson that many programmers never learn, no matter how many times it's taught, and sanity checks are an excellent tool to help you come to terms with it. "I'd never do that" is not a valid excuse for why your code didn't handle a problem, and good sanity checks can help prevent you from ever having to make that excuse.
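A small Python sketch of what such checks might look like in practice. The function, its parameters and the 500 km/h plausibility bound are all hypothetical, chosen only to show input and result checks guarding against garbage in, garbage out.

```python
def average_speed_kmh(distance_km: float, duration_h: float) -> float:
    """Average speed with sanity checks on the inputs and on the result."""
    if distance_km < 0:
        raise ValueError(f"distance cannot be negative: {distance_km}")
    if duration_h <= 0:
        raise ValueError(f"duration must be positive: {duration_h}")

    speed = distance_km / duration_h

    # Assumed plausibility bound for a road vehicle; adjust to the domain.
    if speed > 500:
        raise ValueError(f"implausible speed: {speed} km/h")
    return speed


print(average_speed_kmh(100, 2))  # 50.0
```

The checks cost a few lines, but a user who types a negative distance or a zero duration gets a clear error instead of a silently wrong number.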
For a software application, a sanity test is a set of many tests that make a software version releasable to the public after the integration of new features and bug fixes. A sanity test means that while many issues could remain, the very critical issues which could for example make someone lose money or data or crash the program, have been fixed. Therefore if no critical issues remain, the version passes sanity test. This is usually the last test done before release.
It is a basic test to make sure that something is simply working.
For example: connecting to a database. Or pinging a website/server to see if it is up or down.
The act of checking a piece of code (or anything else, e.g., a Usenet posting) for completely stupid mistakes.
Implies that the check is to make sure the author was sane when it was written;
e.g., if a piece of scientific software relied on a particular formula and was giving unexpected results, one might first look at the nesting of parentheses or the coding of the formula, as a sanity check, before looking at the more complex I/O or data structure manipulation routines, much less the algorithm itself.

The value of test code coverage tools

We've started using Part Cover to track test code coverage of our application. IMO it's a great tool for getting an overall score for your tests and for highlighting areas where you might have been a bit lazy with tests, but today I wrote a test and realised that it didn't really test anything useful, it just increased my coverage!
If you are TDD, then you only write code to pass a test, and the tests are richly describing all the functionality required by the application. So in this scenario is it still very valuable to have coverage analysis?
For those of you that have coverage tools, how religiously do you adhere to keeping the coverage at 100%, and do you ever find yourself writing tests that don't really test anything, just to keep your coverage up? Isn't this a bad thing?
Coverage tools should only be used to tell you what has not been tested. The scenario you pointed out illustrates why you can't rely on them to show you what code has been tested. Writing tests just so the coverage is 100% is pointless (as you suspected), and it's so easy to game that this isn't really a useful metric. I used to try and stay at or near 100%, but I came to the same conclusion that you did. I was writing tests that didn't really test anything just so the numbers were right. Use the tools to spot areas that you haven't tested yet, then write good tests or accept the fact that those parts of the code aren't critical.
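To illustrate how easily the number can be gamed, here is a hedged Python sketch. The apply_discount function and both test names are invented for the example; the first test executes the happy path without asserting anything, while the second checks the behaviour.

```python
def apply_discount(price: float, percent: float) -> float:
    """Return the price after taking off the given percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


def test_covers_but_checks_nothing() -> None:
    # Runs the happy path, so a coverage tool counts those lines as tested,
    # yet this would still pass if the formula were wrong.
    apply_discount(200.0, 25.0)


def test_checks_behaviour() -> None:
    assert apply_discount(200.0, 25.0) == 150.0
    try:
        apply_discount(200.0, 150.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")


test_covers_but_checks_nothing()
test_checks_behaviour()
```

Both tests exercise the same happy-path lines, so a coverage report cannot tell them apart on that path; only the second one would catch a bug.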
I'll play devil's advocate: if increasing your coverage meant writing a test that "didn't test anything useful," then why was that code there? To me, this would be an argument to remove some mainline code.
Or to develop a test that does do something useful. For example, you may consider that it's not useful to test setters and getters. Neither do I. However, those methods should be tested while testing something else. Otherwise, again, why are they there?
But you raise a good point that coverage tools should not be an end in themselves. Especially since they can't tell you what code you need to write.
I've gone into more detail here: http://www.kdgregory.com/index.php?page=junit.coverage
If you're doing pure TDD, there's less value to code coverage because, as you say, you only write code from tests, so you should be at around 100% anyway. But then, it's probably pretty rare (and at times not possible) to be doing it so purely.
If you aren't doing pure TDD, 100% is a pretty unrealistic target anyway. I usually try to go for Roy Osherove's method and only test things with logic (e.g. not straight getters/setters or pass-throughs). But then, higher is always better, and it can be tempting to put a couple more tests in there to increase that coverage..!
Good rationalisation ;) But we are human after all, and I for one sleep much better at night knowing that an untested method or path hasn't made it into production.

Do good tests enable sloppy coding?

Let's say you're coding, and you come across an opportunity for simple code reuse (e.g. pulling a common piece of code out to an accessible place like a Utility class or base class). You might find yourself thinking, "I know it's good to do this, but I have to get this done now, and if I need to make a change to this code, and forget to change it in the other place, my testing framework will let me know."
In other words, you let the awesome tests you (or another developer) have written remind you to change the code in the other places too.
Is this a legitimate problem that we might find in ourselves or other developers?
You're asking whether unit tests encourage you to rely on them as a kind of TODO list? Yes, but I don't think that's sloppy coding. You are, after all, supposed to start with unit tests failing and code to the test; if you refactor some code and then once again code to the test, that isn't sloppy coding -- it's doing what you're supposed to.
I think the problem with unit tests is simply that you can't cover every corner case in a unit test, and sometimes people assume that a working test means a working app, which isn't true.
In the example you provide, good tests are in fact enabling you to implement sloppy design, however in my experience, bad tests wouldn't have discouraged you from doing the same.
The fallacy in your argument centers around the premise that "getting this done now" means you will save time by implementing sloppy design. The truth of the matter is that you are incurring technical debt whether your tests are good or not. Making a change to that code is now a much more complex task, whether you have a good testing framework to remind you of that or not.
"Although immature code may work fine and be completely acceptable to the customer, excess quantities will make a program unmasterable, leading to extreme specialization of programmers and finally an inflexible product."
- Ward Cunningham
The strength of good testing practices may be in allowing you to incur that debt with some level of safety. As long as you continue to be aware that this area of the code is now weak, as a result of your choices, then it may be worth the tradeoff -- you ship your product sooner, at the cost of higher debt, with a lower risk of incurring bugs in the short run as a result.
If the tests are good and the code (sloppy or otherwise) passes them, all is good. It would be nice to have good code, but sloppy working code is better than good broken code.
I don't use tests as my first option to finding the code that needs changes. I'll use my IDE's search (or refactoring) functionality and look for all the places that call the method in question.
The tests are just a nice addition in case I was accidentally sloppy or accidentally introduced a bug. Tests don't make me sloppy from the start, they just reassure me once I think I'm done.
I would say that good tests enable you to fix sloppy coding.
You can certainly write incredibly sloppy code with or without tests. Unit testing makes it slightly easier to get away with it, but only in the short run.
If you have a set of logic copied in two places in your code (IMO the worst thing a developer can do), then you probably have inconsistent tests as well.
The most important job any programmer can do is ruthlessly refactor the code, removing ALL duplication. This almost always shows benefits on even a single iteration.
Why would you think if you had an error in copied code in 2 places that your tests would be any better?
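As a rough illustration of removing that kind of duplication, the Python sketch below keeps one rule in one place. The email-normalising example and all function names are hypothetical; the point is only that both callers share a single helper, so a change cannot be forgotten in "the other place".

```python
from typing import Set


def normalize_email(raw: str) -> str:
    """The single shared definition of how an email address is normalised."""
    return raw.strip().lower()


def register_user(users: Set[str], email: str) -> None:
    # Uses the shared helper instead of repeating strip()/lower() inline.
    users.add(normalize_email(email))


def is_registered(users: Set[str], email: str) -> bool:
    # Same helper, so registration and lookup can never drift apart.
    return normalize_email(email) in users


users: Set[str] = set()
register_user(users, "  Ada@Example.COM ")
print(is_registered(users, "ada@example.com"))  # True
```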
It sounds more to me like sloppy developers and sloppy coding practices are what are leading to sloppy code in your example. The tests you described would prevent the sloppy code from ever getting too far.

When/how frequently should I test?

As a novice developer who is getting into the rhythm of my first professional project, I'm trying to develop good habits as soon as possible. However, I've found that I often forget to test, put it off, or do a whole bunch of tests at the end of a build instead of one at a time.
My question is what rhythm do you like to get into when working on large projects, and where testing fits into it.
Well, if you want to follow the TDD guys, before you start to code ;)
I am very much in the same position as you. I want to get more into testing, but I am currently in a position where we are working to "get the code out" rather than "get the code out right" which scares the crap out of me. So I am slowly trying to integrate testing processes in my development cycle.
Currently, I test as I code, trying to bust the code as I write it. I do find it hard to get into the TDD mindset. It's taking time, but that is the way I would want to work.
EDIT:
I thought I should probably expand on this, this is my basic "working process"...
1. Plan what I want from the code, possible object design, whatever.
2. Create my first class, add a huge comment to the top outlining what my "vision" for the class is.
3. Outline the basic test scenarios. These will basically become the unit tests.
4. Create my first method, also writing a short comment explaining how it is expected to work.
5. Write an automated test to see if it does what I expect.
6. Repeat steps 4-6 for each method (note the automated tests are in a huge list that runs on F5).
7. I then create some beefy tests to emulate the class in the working environment, obviously fixing any issues.
8. If any new bugs come to light following this, I then go back and write the new test in, make sure it fails (this also serves as proof-of-concept for the bug), then fix it.
I hope that helps.. Open to comments on how to improve this, as I said it is a concern of mine..
Before you check the code in.
First and often.
If I'm creating some new functionality for the system I'll be looking to initially define the interfaces and then write unit tests for those interfaces. To work out what tests to write consider the API of the interface and the functionality it provides, get out a pen and paper and think for a while about potential error conditions or ways to prove that it is doing the correct job. If this is too difficult then it's likely that your API isn't good enough.
In regards to the tests, see if you can avoid writing "integration" tests that test more than one specific object, and keep them as "unit" tests.
Then create a default implementation of your interface (that does nothing, returns rubbish values but doesn't throw exceptions), plug it into the tests to make sure that the tests fail (this tests that your tests work! :) ). Then write in the functionality and re-run the tests.
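A minimal Python sketch of that sequence, under the assumption of a made-up TemperatureConverter interface: the stub returns a rubbish value without raising, the check written against the interface fails for it, and then passes once the real functionality is written in.

```python
from abc import ABC, abstractmethod


class TemperatureConverter(ABC):
    """Hypothetical interface, defined before any implementation exists."""

    @abstractmethod
    def celsius_to_fahrenheit(self, celsius: float) -> float:
        ...


class StubConverter(TemperatureConverter):
    """Default implementation: does nothing useful, but never raises."""

    def celsius_to_fahrenheit(self, celsius: float) -> float:
        return -1.0  # rubbish value on purpose


class RealConverter(TemperatureConverter):
    def celsius_to_fahrenheit(self, celsius: float) -> float:
        return celsius * 9 / 5 + 32


def check_converter(converter: TemperatureConverter) -> bool:
    """The test written against the interface; True means it passes."""
    return (converter.celsius_to_fahrenheit(0) == 32
            and converter.celsius_to_fahrenheit(100) == 212)


print(check_converter(StubConverter()))  # False: the test can actually fail
print(check_converter(RealConverter()))  # True
```

Seeing the check fail against the stub is what shows that the test is capable of detecting a wrong implementation.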
This mechanism isn't perfect but will cover a lot of simple coding mistakes and provide you with an opportunity to run your new feature without having to plug it into the entire application.
Following this you then need to test it in the main application with the combination of existing features.
This is where testing is more difficult and, if possible, should be partially outsourced to a good QA tester, as they'll have the knack of breaking things. Although it helps if you have these skills too.
Getting testing right is a knack that you have to pick up to be honest. My own experience comes from my own naive deployments and the subsequent bugs that were reported by the users when they used it in anger.
At first when this happened to me I found it irritating that the user was intentionally trying to break my software and I wanted to mark all the "bugs" down as "training issues". However, after reflecting on it I realised that it is our role (as developers) to make the application as simple and reliable to use as possible, even by idiots. It is our role to empower idiots and that's why we get paid the dollar. Idiot handling.
To effectively test like this you have to get into the mindset of trying to break everything. Assume the mantle of a user that bashes the buttons and generally attempts to destroy your application in weird and wonderful ways.
Assume that if you don't find flaws then they will be discovered in production, to your company's serious loss of face. Take full responsibility for all of these issues and curse yourself when a bug you are responsible (or even part responsible) for is discovered in production.
If you do most of the above then you should start to produce much more robust code, however it is a bit of an art form and requires a lot of experience to be good at.
A good key to remember is
"Test early, test often and test again, when you think you are done"
When to test? When it's important that the code works correctly!
When hacking something together for myself, I test at the end. Bad practice, but these are usually small things that I'll use a few times and that's it.
On a larger project, I write tests before I write a class and I run the tests after every change to that class.
I test constantly. After I finish even a loop inside of a function, I run the program and hit a breakpoint at the top of the loop, then run through it. This is all just to make sure that the process is doing exactly what I want it to.
Then, once a function is finished, you test it in its entirety. You probably want to set a breakpoint just before the function is called, and check your debugger to make sure that it works perfectly.
I guess I would say: "Test often."
I've only recently added unit testing to my regular work flow but I write unit tests:
to express the requirements for each new code module (right after I write the interface but before writing the implementation)
every time I think "it had better ... by the time I'm done"
when something breaks, to quantify the bug and prove that I've fixed it
when I write code which explicitly allocates or deallocates memory---I loathe hunting for memory leaks...
I run the tests on most builds, and always before running the code.
Start with unit testing. Specifically, check out TDD, Test Driven Development. The concept behind TDD is you write the unit tests first, then write your code. If the test fails, you go back and re-work your code. If it passes, you move on to the next one.
I take a hybrid approach to TDD. I don't like to write tests against nothing, so I usually write some of the code first, then put the unit tests in. It's an iterative process, one which you're never really done with. You change the code, you run your tests. If there are any failures, fix and repeat.
The other sort of testing is integration testing, which comes along later in the process, and might typically be done by a QA testing team. In any case, integration testing addresses the need to test the pieces as a whole. It's the working product you're concerned with testing. This one is more difficult to deal with b/c it usually involves having automated testing tools (like Robot, for ex.).
Also, take a look at a product like CruiseControl.NET to do continuous builds. CC.NET is nice b/c it will run your unit tests with each build, notifying you immediately of any failures.
We don't do TDD here (though some have advocated it), but our rule is that you're supposed to check your unit tests in with your changes. It doesn't always happen, but it's easy to go back and look at a specific changeset and see whether or not tests were written.
I find that if I wait until the end of writing some new feature to test, I forget many of the edge cases that I thought might break the feature. This is ok if you are doing things to learn for yourself, but in a professional environment, I find my flow to be the classic form of: Red, Green, Refactor.
Red: Write your test so that it fails. That way you know the test is asserting against the correct variable.
Green: Make your new test pass in the easiest way possible. If that means hard-coding it, that's ok. This is great for those that just want something to work right away.
Refactor: Now that your test passes, you can go back and change your code with confidence. Your new change broke your test? Great, your change had an implication you didn't realize, now your test is telling you.
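The three steps can be sketched in Python roughly as follows. The slugify example and every function name here are invented for illustration; each stage is kept as a separate function only so that all three can be run side by side.

```python
def passes(slugify_impl) -> bool:
    """The one test we care about; True means the implementation passes."""
    try:
        return slugify_impl("Hello World") == "hello-world"
    except NotImplementedError:
        return False


# Red: the test exists, the behaviour does not, so the test fails.
def slugify_red(title: str) -> str:
    raise NotImplementedError


# Green: the easiest thing that passes, here a hard-coded answer.
def slugify_green(title: str) -> str:
    return "hello-world"


# Refactor: replace the shortcut with real logic, keeping the test passing.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


print(passes(slugify_red))    # False
print(passes(slugify_green))  # True
print(passes(slugify))        # True
```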
This rhythm has made me speed my development over time because I basically have a history compiler for all the things I thought that needed to be checked in order for a feature to work! This, in turn, leads to many other benefits, that I won't get to here...
Lots of great answers here!
I try to test at the lowest level that makes sense:
If a single computation or conditional is difficult or complex, add test code while you're writing it and ensure each piece works. Comment out the test code when you're done, but leave it there to document how you tested the algorithm.
Test each function.
Exercise each branch at least once.
Exercise the boundary conditions -- input values at which the code changes its behavior -- to catch "off by one" errors.
Test various combinations of valid and invalid inputs.
Look for situations that might break the code, and test them.
Test each module with the same strategy as above.
Test the body of code as a whole, to ensure the components interact properly. If you've been diligent about lower-level testing, this is essentially a "confidence test" to ensure nothing broke during assembly.
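For the boundary-condition item in the list above, a short Python sketch may help. The ticket_price function and its age thresholds are hypothetical; the checks sit on either side of each value at which the code changes behaviour, which is where "off by one" errors tend to hide.

```python
def ticket_price(age: int) -> int:
    """Hypothetical pricing rule with two boundaries: 12 and 65."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 12:
        return 5   # child
    if age < 65:
        return 10  # adult
    return 7       # senior


def check_boundaries() -> None:
    # One value on each side of every boundary.
    assert ticket_price(0) == 5
    assert ticket_price(11) == 5
    assert ticket_price(12) == 10
    assert ticket_price(64) == 10
    assert ticket_price(65) == 7
    # Invalid input just past the lowest valid value.
    try:
        ticket_price(-1)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a negative age")


check_boundaries()
```

If someone later writes `age <= 12` by mistake, the check at 12 fails immediately, while a test using only a "typical" age such as 30 would not notice.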
Since most of my code is for embedded devices, I pay particular attention to robustness, interaction between various threads, tasks, and components, and unexpected use of resources: memory, CPU, filesystem space, etc.
In general, the earlier you encounter an error, the easier it is to isolate, identify, and fix it--and the more time you get to spend creating, rather than chasing your tail.*
* I know, -1 for the gratuitous buffer-pointer reference!