Suppose I'm writing a library that stores a sequence of doubles to a file in a certain format. The format requires that the doubles are monotonically increasing.
Now, some users won't read the manual carefully, or will write buggy frontends that do something like
store(3.0)
store(3.1)
store(0.3)
store(7.8)
What the library could do is
(1) Error out when store(0.3) is called.
(2) Try to correct the error by making a good guess, e.g., actually store(3.3).
(3) Correct the error and write a message to stderr.
[...]
The advantage of (1) would be that the user cannot miss it. If the code ran for a long time (which is the regular case in my context), though, the user wouldn't be too happy with the program aborting.
(2) would do away with this, but possibly encourage misusing the library.
Are there policies in any language that advocate one approach over the other?
Irrespective of the language used, my general advice is to always fail quickly. This localises errors to the actual source of the problem - i.e., throw an error or exception and bail out (perhaps permitting the programmer to catch the exception, depending on the language). Similarly, some languages with checked exceptions might force the programmer to add a check for malformed input.
The reason for this is simple - the further away from the actual source of the problem that the errors manifest, the harder the program is to debug. Let's say the programmer didn't mean 3.3 (as opposed to 0.3) and you corrected it for him - well, the program will keep running, but at some point the value 3.3 will manifest and potentially cause other problems. It might also be that the source of these values was some kind of sorting algorithm with bugs - the fact that your library doesn't fail in this case will simply make it harder to debug the sorting algorithm and identify the real cause of the failure.
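To make that concrete for the store example above, here is a minimal fail-fast sketch in Python (MonotonicWriter and its methods are invented names for illustration, not an existing API):

class MonotonicWriter:
    """Stores a sequence of doubles, failing fast on non-monotonic input."""

    def __init__(self, outfile):
        self.outfile = outfile
        self.last = None

    def store(self, value):
        if self.last is not None and value <= self.last:
            # Fail at the actual source of the problem instead of guessing.
            raise ValueError(
                "store(%r) violates monotonicity; last stored value was %r"
                % (value, self.last))
        self.outfile.write("%r\n" % value)
        self.last = value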
It also plays hell with any attempts to unit test the code - code that should fail doesn't necessarily fail in the right place. This just makes the code magical and much more difficult to manage as part of a development process.
There is an alternative to simply failing and forcing the user or client program to start the interaction all over again - you could do things in a transactional manner such that the library is left in a consistent state after the failure, permitting the user to proceed from the last valid input (for example). This should be implemented with proper rollback semantics though, to ensure data consistency.
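A sketch of what such rollback semantics might look like, assuming hypothetical snapshot/restore operations on the writer:

def store_batch(writer, values):
    saved = writer.snapshot()      # hypothetical: capture the current valid state
    try:
        for v in values:
            writer.store(v)
    except ValueError:
        writer.restore(saved)      # hypothetical: roll back to a consistent state
        raise                      # still fail fast, but leave valid data behind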
So in summary: fail fast, and fail early.
I came across the following excerpt in the book CLR via C#:
Important Many developers incorrectly believe that an exception is related to how frequently
something happens. For example, a developer designing a file Read method is likely to say the
following: “When reading from a file, you will eventually reach the end of its data. Since reaching the
end will always happen, I’ll design my Read method so that it reports the end by returning a special
value; I won’t have it throw an exception.” The problem with this statement is that it is being made by
the developer designing the Read method, not by the developer calling the Read method.
When designing the Read method, it is impossible for the developer to know all of the possible
situations in which the method gets called. Therefore, the developer can’t possibly know how often the
caller of the Read method will attempt to read past the end of the file. In fact, since most files contain
structured data, attempting to read past the end of a file is something that rarely happens.
I cannot understand two things that the excerpt (from my point of view) was intended to explain. What does it mean that an exception is related to how frequently something happens? And how can one prove that this is not a correct way of thinking? (I believe a counterexample does the job of proving this, but I still do not understand the counterexample presented in the excerpt above.)
I do not understand the counterexample. OK, suppose someone calls a method that reads from a file many times after the end of the file has been reached. OK, let the method report the end of the file each of those times. I see no reason for this to be worse than throwing an exception.
The author is saying a developer should not attempt to guess how often a branch of code will be executed by users, and should not decide whether to throw an exception from that branch based on their guess. In other words, it is incorrect to define an exception as, "something that doesn't happen very often."
The obvious reason for not making guesses is they may be wrong. A more fundamental reason is that exceptions are not necessarily infrequent, depending on the business domain. Consider an e-commerce site where users enter credit card numbers. Users will frequently enter their card numbers incorrectly. If we related exceptions to how frequently something happens, we might determine an incorrect CC number is not an exception, because it happens quite often.
Developers may be reluctant to throw exceptions. This often results in applications that "fail slow" because error conditions propagate beyond the point where they occur. Exceptions encourage an application to fail fast.
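To illustrate with a sketch (read_record is a made-up helper, not a real library call): a sentinel return value is easy for callers to silently ignore, whereas an exception is not.

def read_record(f, size):
    data = f.read(size)
    if len(data) < size:
        # For structured data, a short read usually means corruption,
        # so raise rather than return an in-band sentinel like None.
        raise EOFError("truncated record: wanted %d bytes, got %d"
                       % (size, len(data)))
    return data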
Related: Avoid in-band error indicators.
Over the years I've written code in a variety of languages and environments, but one constant seemed to be the consensus on the use of assertions. As I understand it, they are there for the development process, when you want to identify "impossible" errors and other situations to which your first reaction would be "that can't be right", and which cannot be handled gracefully, leaving the system in a state where it has no choice but to terminate. Assertions are easy to understand and quick to code, but due to their fail-fast nature they are unsuitable for production code. Ideally, assertions are used to discover all development bugs and are then removed or turned off when shipping the code. Input or program states that are wrong but possible (and expected to occur) should instead be handled gracefully via exceptions or other error-handling techniques.
However, none of this seems to hold true for writing ABAP code for SAP. I've just spent the better part of an hour trying to track down the precise location where an assert was giving me an unintelligible error. This turned out to be five levels down in standard SAP code, which is apparently riddled with ASSERT statements. I now know that a certain variable identifying a table IS NOT INITIAL while its accompanying variable identifying a field is.
This tells me nothing. The Web Dynpro component running this code actually "catches" this assert, showing me a generic error message, which only serves to prevent the debugger from launching when the assert is tripped.
My question therefore is what the guidelines or best practices are for the use of assertions in ABAP. Is this SAP writing bad code? Is it an accepted practice to fill your custom code with asserts and leave them in when shipping the code? If so, how would we go about handling these asserts in runtime so that the application doesn't crash and burn while still being able to identify the cause of the error?
The guidelines and best practices are virtually the same in ABAP development as in any other language: assertions should be used as internal sanity checks only, and exceptions for regular input-validation errors and the like. It might be sensible to leave the assertions in the code - after all, you'd rather have your program crash in a controlled fashion than continue in an unforeseen way and possibly damage critical data in the process without anyone noticing. Take a look at checkpoint groups if you don't want your program to abort in a production environment - but in my opinion: what's the use of a sanity check (as a last line of defense) if it's disabled in the environment where it matters most?
Of course I'm assuming that the input is validated properly (so that crashes are prevented) and that all APIs are used according to the intended use and documentation. Unfortunately - as with every other programming language - it's up to the developer to live up to these standards.
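Expressed in Python rather than ABAP (a minimal sketch; distribute is a made-up helper), the same split might look like this - and note that Python's assert, somewhat like an ABAP checkpoint group, can be disabled wholesale (via the -O flag):

def post_invoice(amount, customers):
    if amount <= 0:
        # Wrong but possible input: validate and raise an exception.
        raise ValueError("amount must be positive, got %r" % amount)
    total = distribute(amount, customers)  # hypothetical business logic
    # "Impossible" internal state: assert as a last line of defense.
    assert total == amount, "distribution lost money: %r != %r" % (total, amount)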
When I encounter a violation in Sonar (in the violation drilldown tab), the source code view offers actions like comment, assign, etc. One of these is False-positive. I want to know exactly what this operation means, and when I should use it.
Like any automatic tool, Sonar - and the rule engines it relies on (Findbugs/PMD/Checkstyle/...) - can make "mistakes" when raising a violation: only a human can detect this, and you have the ability to flag such a "mistake" as a false positive to be sure that you won't spend time on it again.
Obviously, this feature must not be used to mute real violations. What's more, each time you flag a violation as a false positive, it is a good habit to write a meaningful comment (and also to report the issue on the user mailing list of the corresponding tool).
A false positive, then, is when the software tells you there is a violation but you know better (e.g., there is a reason, better than laziness, why the statement is written the way it is), and flagging it this way lets you mark the finding as "done the right way".
However, this functionality is sometimes used to get a "clean" report for the manager. That is the worst thing that could happen.
Generally speaking - you should not use it.
I'd like to know if there is a Pythonic way for handling errors in long-running functions that can have errors in part that do not affect the ability of the function to continue.
As an example, consider a function that, given a list of URLs, recursively retrieves each resource and all linked resources under the path of the top-level URLs. It stores the retrieved resources in a local filesystem with a directory structure mirroring the URL structure. Essentially, this is a basic recursive wget for a list of pages.
There are quite a number of points where this function could fail:
A URL may be invalid, or unresolvable
The host may not be reachable (perhaps temporarily)
Saving locally may have disk errors
Anything else you can think of.
A failure on retrieving or saving any one resource only affects the function's ability to continue to process that resource and any child resources that may be linked from it, but it is possible to continue to retrieve other resources.
A simple model of error handling is that on the first error, an appropriate exception is raised for the caller to handle. The problem with this is that it terminates the function and does not allow it to continue. The error could possibly be fixed and the function restarted from the beginning, but this would cause work to be redone, and any permanent errors may mean we never complete.
A couple of alternatives I have in mind are:
Record errors in a list as they occur, abort processing of that resource and any child resources, but continue on to the next resource. A threshold could be used to abort the entire function if too many errors occur, or perhaps we just try everything. The caller can interrogate this list on completion of the function to see whether there were any problems.
The caller could provide a callable object that is called with each error. This moves responsibility for recording errors back to the caller. You could even specify that processing should stop if the callable returns False; this would move the threshold management to the caller as well (see the sketch after this list).
Implement the former in terms of the latter, providing an error-handling object that encodes the former's behavior.
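A sketch of the second alternative (all names here are illustrative, not an established API):

def mirror(urls, on_error=None):
    for url in urls:
        try:
            fetch_and_save(url)    # hypothetical worker; may raise
        except Exception as exc:
            if on_error is None:
                raise              # no handler given: fail immediately
            if on_error(url, exc) is False:
                break              # the callback asked us to stop early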
In Python discussions, I've often noted certain approaches described as Pythonic or non-Pythonic. I'd like to know if there are any particularly Pythonic approaches to handling the type of scenario described above.
Does Python have any batteries included that model more sophisticated error handling than the terminate model of exception handling, or do the more complex batteries included use a model of error handling that I should copy to stay Pythonic?
Note: Please do not focus on the example. I'm not looking to solve problems in that particular space, but it seemed like a good example that most people here would have an understanding of.
I don't think there's a particularly clear "Pythonic/non-Pythonic" distinction at the level you're talking about here.
One of the big reasons there's no "one-size-fits-all" solution in this domain is that the exact semantics you want are going to be problem-specific.
For one situation, abort-on-first-failure may be adequate.
For another, you may want abort-and-rollback if any of the operations fails.
For a third, you may want to complete as many as possible and simply log and ignore failures.
For a fourth, you may want to complete as many as possible, but raise an exception at the end to report any that failed.
Even supporting an error handler doesn't necessarily cover all of those desired behaviours - a simple per-failure error handler can't easily provide abort-and-rollback semantics, or generate a single exception at the end. (It's not impossible - you just have to mess around with tricks like passing bound methods or closures as your error handlers.)
So the best you can do is take an educated guess at typical usage scenarios and desirable behaviours in the face of errors, and design your API accordingly.
A fully general solution would accept an on-error handler that is given each failure as it happens, and a final "errors occurred" handler that gives the caller a chance to decide how multiple errors are handled (with some protocol to allow data to be passed from the individual error handlers to the final batch error handler).
However, providing such a general solution is likely to be an API design failure. The designer of the API shouldn't be afraid to have an opinion on how their API should be used, and how errors should be handled. The main thing to keep in mind is to not overengineer your solution:
if the naive approach is adequate, don't mess with it
if collecting failures in a list and reporting a single error is good enough, do that
if you need to rollback everything if one part fails, then just implement it that way
if there's a genuine use case for custom error handling, then accept an error handler as a part of the API. But have a specific use case in mind when you do this, don't just do it for the sake of it. And when you do, have a sensible default handler that is used if the user doesn't specify one (this may just be the naive "raise immediately" approach)
If you do offer selectable error handlers, consider offering some standard error handlers that can be passed in either as callables or as named strings (i.e. along the lines of the error handler selection for text codecs)
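For instance, "callables or named strings" might be supported along these lines (a sketch, not an existing API):

def _raise(exc):
    raise exc

STANDARD_HANDLERS = {
    "strict": _raise,               # fail on the first error
    "ignore": lambda exc: None,     # skip failures silently
}

def resolve_error_handler(handler):
    # Accept a callable directly, or look up a registered name,
    # much like the error handler selection for text codecs.
    return handler if callable(handler) else STANDARD_HANDLERS[handler]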
Perhaps the best you're going to get as a general principle is that "Pythonic" error handling will be as simple as possible, but no simpler. But at that point, the word is just being used as a synonym for "good code", which isn't really its intent.
On the other hand, it is slightly easier to talk about what actual forms non-Pythonic error handling might take:
def myFunction(an_arg, error_handler):
    # ... do stuff; err_occurred and err stand in for whatever went wrong ...
    if err_occurred:
        if isinstance(err, RuntimeError):
            error_handler.handleRuntimeError()
        elif isinstance(err, IOError):
            error_handler.handleIOError()
The Pythonic idiom is that error handlers, if supported at all, are just simple callables. Give them the information they need to decide how to handle the situation, rather than try to decide too much on their behalf. If you want to make it easier to implement common aspects of the error handling, then provide a separate helper class with a __call__ method that does the dispatch, so people can decide whether or not they want to use it (or how much they want to override when they do use it).
This isn't completely Python-specific, but it is something that folks coming from languages that make it annoyingly difficult to pass arbitrary callables around (such as Java, C, C++) may get wrong. So complex error handling protocols would definitely be a way to head into "non-Pythonic error handling" territory.
The other problem in the above non-Pythonic code is that there's no default handler provided. Forcing every API user to make a decision they may not yet be equipped to make is just poor API design. But now we're back in general "good code"/"bad code" territory, so Pythonic/non-Pythonic really shouldn't be used to describe the difference.
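By contrast, the Pythonic shape of the same API might be (again just a sketch with invented names):

def raise_immediately(exc):
    raise exc                       # sensible default: fail fast

def my_function(an_arg, error_handler=raise_immediately):
    try:
        do_stuff(an_arg)            # hypothetical work; may raise
    except (RuntimeError, IOError) as exc:
        # Give the handler the information it needs and let it decide.
        error_handler(exc)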
Error handling should rely on exceptions and logging: for each error, raise an exception and log an error message.
Then, at any calling level, catch the exception, log any additional context if needed, and handle the issue.
If the issue is not fully handled, re-raise the exception so that upper levels can catch the same exception and perform different actions.
At any of these stages you can keep a counter for certain types of exceptions, so that you perform some actions only after a specific number of issues have occurred.
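A minimal sketch of the raise / log / re-raise part of that advice (parse_file is a placeholder for the real work):

import logging
log = logging.getLogger(__name__)

def load_config(path):
    try:
        return parse_file(path)     # placeholder; may raise
    except ValueError:
        log.exception("invalid configuration in %s", path)
        raise                       # not fully handled here: re-raise for callers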
In other words, do you spend time anticipating errors and writing code to get around these potential issues, or do you write the code as you see fit and then work through any errors on an issue-by-issue basis?
I've been thinking a lot about this lately, and I'm very much a reactive person. I write my code, give it a whirl, go back to correct errors, and repeat until the application works as expected. However, a friend of mine offered that he spends time thinking about how each line will be interpreted and fixes errors before they occur.
I must point out that my reactive approach is purely pre-live. I definitely make sure my application is working before it goes live.
There should always be a balance.
Too much error checking is slow and leads to garbage code. Too little error checking makes your program crash on edge cases, which is not a good thing to discover after having shipped it.
So decide how reliable a given piece of code should be and implement error checking accordingly. A test utility may not need to be very reliable: less error checking. A COM server meant to be used by a third-party search service in deep background should be super reliable: much more error checking.
I think asking this in isolation is kind of weird and very subjective; however, there are obviously a bunch of techniques that permit you to do each. I tend to use these two:
Test-driven development (this would seem to be proactive)
Strong, static typing (reactive, but part of a tight iterative development cycle, as in, it's enforced by my ML compiler, and I compile a lot)
Very occasionally I swerve into the world of formal verification of programs. That's definitely "reactive", but if you think a little more up-front, it tends to make the verification easier.
I must also say that I value a lot of up-front thought in programming. The easiest way to avoid bugs is to not write them in the first place. Sometimes it's inevitable, but often a little more time spent thinking about the problem can lead to better-quality solutions, and then the rest can be taken care of using the kinds of automated methods I talked about above.
I usually ask myself a bunch of what-ifs when coding, like
The user clicks the button, what if they didn't select a date?
The user is typing in the search box, what if they try to type html in there?
My label text depends on a value from a shared drive, what if it's not mapped?
and so on. By doing this I've found that when the application does go live, there are a ton fewer errors and I can focus on fixing more obscure bugs instead of correcting conditions that should have been in place to begin with.
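For instance, the first what-if above might become a simple guard clause (a sketch with made-up UI helpers):

def on_search_clicked(form):
    if form.date is None:
        show_message("Please select a date first.")  # hypothetical UI helper
        return
    run_search(form)                                 # hypothetical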
I live by a simple principle when considering error-handling: garbage in, garbage out. If you don't want any garbage (e.g. invalid input) messing up your software, you have to find all the points in your software where it can get in and handle it. Of course, the more complicated your software is, the harder it is to find every point of entry, but I feel that the more you do up front the less reactive you will need to be later on.
I advocate the proactive approach.
I try to write code in a style that results in maintainable and reliable code
I use defensive programming techniques to prevent stupid errors caused by lapses of attention and the like
I design the database model according to the fortress principle, with the SQL code checking results after each individual operation
I think of potential problems that could occur in that part of the code and account for them - not for every possibility, but for the major ones I can think of
This usually results in software operating rather smoothly. At times it even surprises me but that was the intended goal, so here we are.
IMHO, the word "error" (or its loose synonym "bug") itself denotes program behavior that was not foreseen.
I usually try to design with all possible scenarios in mind. Of course, it is usually not possible to think of all possible cases, but thinking through and allowing for as many scenarios as possible is usually better than just getting something working as soon as possible. This saves a lot of time and effort debugging and redesigning the code. I often sit down with pen and paper for even the smallest of programming tasks before actually typing any code into my editor.
As I said, this will not eliminate all errors. For me it pays off many times over in terms of time spent debugging. Another benefit is that it results in a more solid and maintainable design with fewer bugfixing hacks and special cases added on later. But in any case, you will have to do a lot of debugging after the code is done.
This does not apply when all you want is a mockup or rapid prototype. Also, practical constraints such as deadlines often make a thorough evaluation difficult or impossible.
What kind of programming? It's impossible to answer this in any general way. (It's like asking "do you wear a helmet when playing?" -- well, playing what?)
At work, I'm working on a database-backed website. The requirements are strict, and if I don't anticipate how users will screw it up, I'm going to get a call at some odd hour of the day to fix it.
At home, I'm working on a program ... I don't even know what it'll do yet. I can't deal with 'errors' because I don't know what 'an error' is in this context, because I don't know what correct behavior is going to be. The entire purpose of the program can and frequently does change on a timescale of minutes to hours, so even a couple minutes spent thinking about errors this early is a complete waste of time. (It's even worse than browsing SO, since error-handling adds lines of code.)
I guess the only general answer is "I do what makes sense in terms of saving time in the long term", which is, after all, the whole reason to use machines to do work for us.