How do I find out what errors a Tcl command can generate?

The Tcl try manpage has the following example:
try {
    set f [open /some/file/name w]
} trap {POSIX EISDIR} {} {
    puts "failed to open /some/file/name: it's a directory"
} trap {POSIX ENOENT} {} {
    puts "failed to open /some/file/name: it doesn't exist"
}
That's great, it works, but how would I have found out that {POSIX ENOENT} is a possible trap pattern for open? The open manpage doesn't mention it. For a given arbitrary command in Tcl, how do I find out what the possible errors are?

try {} trap {} is used when there is a specific error that needs to be trapped.
For a more general trap, use try {} on error {}.
try {
    set fh [open myfile.txt w]
} on error {err res} {
    puts "Error on open: $err"
}
There is also the catch command:
if { [catch {set fh [open myfile.txt w]}] } {
    puts "error on open."
}
References: try catch

The various POSIX errors come from the OS, and you need to take a guess at the system call and look them up. For example, it's not a great reach to guess that the open command maps to the open() system call, and so it has the errors documented there. Some are vastly unlikely with Tcl (e.g., those relating to passing a bad buffer in, which is POSIX EFAULT) but we don't guarantee that the OS won't return them because the OS simply doesn't give that guarantee to us.
We ought to document the most likely ones from commands that touch the operating system, but at a high level:
the POSIX class ones are from the OS (e.g., reading a non-existent file is POSIX ENOENT), and
the TCL class ones are from Tcl's own internal code (e.g., from passing the wrong number of arguments to open, which gives you TCL WRONGARGS, or asking for too large a memory allocation, which gives you TCL MEMORY if Tcl manages to recover).
We're unlikely to exhaustively document all the possibilities (especially in the TCL class) since many are unlikely in correct code.
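If you can provoke the failure, a quick empirical check is to run the command under catch and inspect the -errorcode entry of the options dictionary; this is the same value that try matches its trap patterns against. A minimal sketch:
catch {open /no/such/file r} msg opts
puts [dict get $opts -errorcode]
;# prints something like: POSIX ENOENT {no such file or directory}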

How to check if a PDF has any kind of digital signature

I need to understand whether a PDF has any kind of digital signature. I have to manage huge PDFs, e.g. 500 MB each, so I just need a way to separate non-signed from signed files (so I can send just the signed PDFs to a method that manages them). Every procedure I have found so far involves attempting to extract the certificate, e.g. via the Bouncy Castle libraries (in my case, for Java): if the certificate is present, the PDF is signed; if it is not present or an exception is raised, it is not. But this is obviously time- and memory-consuming, besides being an example of a resource-wasting implementation.
Is there any quick, language-independent way, e.g. opening the PDF file and reading the first few bytes, to find some information that tells whether the file is signed?
Alternatively, is there any reference manual that describes in detail how a PDF is structured internally?
Thank you in advance
You are going to want to use a PDF Library rather than trying to implement this all yourself, otherwise you will get bogged down with handling the variations of Linearized documents, Filters, Incremental updates, object streams, cross-reference streams, and more.
As for reference material: per my cursory search, it looks like Adobe is no longer providing its version of the ISO 32000-1:2008 specification to anyone who asks, though that specification is mainly a translation of the PDF v1.7 Reference manual into ISO-conforming language.
So assuming the PDF v1.7 Reference, the most relevant sections are going to be 8.7 (Digital Signatures), 3.6.1 (Document Catalog), and 8.6 (Interactive Forms).
The basic process is going to be:
Read the Document Catalog for 'Perms' and 'AcroForm' entries.
Read the 'Perms' dictionary for 'DocMDP', 'UR', or 'UR3' entries. If these entries exist, in all likelihood you have either a certified document or a Reader-enabled document.
Read the 'AcroForm' entry (make sure that you do not have an 'XFA' entry, because in the words of Frazier from Porgy and Bess: Dat's a complication!). You basically want to first check whether there is an (optional) 'SigFlags' entry, in which case a non-zero value indicates that there is a signature in the Fields array. Otherwise, you need to walk each entry of the 'Fields' array looking for a field dictionary with an 'FT' (Field Type) entry set to 'Sig' (signature) and a 'V' (Value) entry that is not null.
Using a PDF library that can use the document's cross-reference table to navigate you to the right indirect objects should be faster and less resource-intensive than a brute-force search of the document for a certificate.
This is not the optimal solution, but it is another one... you can check for "/SigFlags" and stop at the first match:
grep -m1 "/SigFlags" ${PDF_FILE}
or get such files inside a directory:
grep -r --include=*.pdf -m1 -l "/SigFlags" . > signed_pdfs.txt
grep -r --include=*.pdf -m1 -L "/SigFlags" . > non_signed_pdfs.txt
Grep can be very fast on big files. You can run this as a batch job for a certain time and process the resulting lists (the .txt files) afterwards.
Note that the file could be modified incrementally after being signed, so the latest revision might not be covered by the signature; whether that still counts as "signed" depends on what you actually mean by "signed".
Anyway, if the file doesn't contain the /SigFlags string, it is almost certain that it was never signed.
Note that conforming readers start reading backwards (from the end of the file), because that is where they find the cross-reference table that records the location of every object.
I advise you to use peepdf to check the inner structure of the file. It supports executing commands against the file. For example:
$ peepdf -C "search /SigFlags" signed.pdf
[6]
$ peepdf -C "search /SigFlags" non-signed.pdf
Not found!!
But I have not tested the performance of that. You can use it to browse the internal structure of the PDF and learn from the PDF v1.7 Reference. Check the annexes with PDF examples there.
From the command line you can check whether a file has a digital signature with the pdfsig tool from the poppler-utils package (works on Ubuntu 20.04).
pdfsig pdffile.pdf
will produce output with detailed data on the signatures included and validation data. If you need to scan a PDF file tree and get a list of signed PDFs, you can use a bash command like:
find ./path/to/files -iname '*.pdf' \
-exec bash -c 'pdfsig "$0"; \
if [[ $? -eq 0 ]]; then \
echo "$0" >> signed-files.txt; fi' {} \;
You will get a list of signed files in signed-files.txt file in the local directory.
I have found this to be much more reliable than trying to grep some text out of a pdf file (for example, the pdfs produced by signing services in Lithuania do not contain the string "SigFlags" which was mentioned in the previous answers).
After six years, this is the solution I implemented in Java via iText that can detect the presence of any PAdES signature on an unprotected PDF file.
This easy method returns a 3-state Boolean (don't wallop me for that, lol): Boolean.TRUE means "signed"; Boolean.FALSE means "not signed"; null means that something nasty happened reading the PDF (and in this case, I send the file to the old slow analysis procedure). After about half a million PAdES-signed PDFs were scanned, I didn't have any false negatives, and after about 7 million unsigned PDFs I didn't have any false positives.
Maybe I was just lucky (my PDF files were signed just once, and always in the same way), but it seems that this method works - at least for me. Thanks @Patrick Gallot
private Boolean isSigned(URL url)
{
    try {
        PdfReader reader = new PdfReader(url);
        PRAcroForm acroForm = reader.getAcroForm();
        if (acroForm == null) {
            return false;
        }
        // The following can lead to false negatives
        // boolean hasSigflags = acroForm.getKeys().contains(PdfName.SIGFLAGS);
        // if (!hasSigflags) {
        //     return false;
        // }
        List<?> fields = acroForm.getFields();
        for (Object k : fields) {
            FieldInformation fi = (FieldInformation) k;
            PdfObject ft = fi.getInfo().get(PdfName.FT);
            if (PdfName.SIG.equals(ft)) {
                logger.info("Found signature named {}", fi.getName());
                return true;
            }
        }
    } catch (Exception e) {
        logger.error("Whazzup?", e);
        return null;
    }
    return false;
}
Another function that should work correctly (I found it recently while checking a paper written by Bruno Lowagie, Digital Signatures for PDF documents, page 124) is the following:
private Boolean isSignedShorter(URL url)
{
    try {
        PdfReader reader = new PdfReader(url);
        AcroFields fields = reader.getAcroFields();
        return !fields.getSignatureNames().isEmpty();
    } catch (Exception e) {
        logger.warn("Whazzup?", e);
        return null;
    }
}
I personally tested it on about a thousand signed/unsigned PDFs and it seems to work too, probably better than mine in case of complex signatures.
I hope to have given a good starting point to solve my original issue :)

How can I use cmake to test processes that are expected to fail with an exception? (e.g., failures due to clang's address sanitizer)

I've got some tests that check that clang's address sanitizer catches particular errors. (I want to ensure my understanding of the types of error it can catch is correct, and that future versions continue to catch the types of errors I'm expecting them to.) This means I have several tests that fail by crapping out with an OTHER_FAULT, which appears to be the fixed way that clang's runtime reports an error.
I've set the WILL_FAIL flag to TRUE for these tests, but this only seems to check the return value of a process that terminates normally, without an exception. If the process terminates with an exception, cmake still classes the test as a failure.
I've also tried using PASS_REGULAR_EXPRESSION to watch for the distinguishing messages that are printed when this error occurs, but again, cmake seems to class the test as a failure if it terminates with an exception.
Is there anything I can do to get around this?
(clang-specific answers are also an option! - but I doubt this will be the last time I need to test something like this, so I'd prefer to know how to do it with cmake generally, if it's possible)
CTest provides only basic, commonly used interpretations of a test program's result. To implement other interpretations you can write a simple program or script which wraps the test and interprets its result as needed. E.g. a C program (for Linux):
test_that_crash.c:
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char** argv)
{
    pid_t pid = fork();
    if (pid == -1)
    {
        // fork failed
        return 1;
    }
    else if (pid)
    {
        // Parent - wait for the child and interpret its result
        int status = 0;
        wait(&status);
        if (WIFSIGNALED(status)) return 0; // terminated by a signal: the expected crash happened
        else return 1;
    }
    else
    {
        // Child - execute the wrapped command
        execvp(argv[1], argv + 1);
        exit(1); // exec failed
    }
}
This program can be used in CMake as follows:
CMakeLists.txt:
# Compile our wrapper
add_executable(test_that_crash test_that_crash.c)
# Similar to add_test(name command), but the test is deemed successful only if it crashes (is signalled)
macro(add_test_crashed name command)
    # Use the generic add_test() flow so our wrapper executable target is recognized automatically
    add_test(NAME ${name} COMMAND test_that_crash ${command} ${ARGN})
endmacro(add_test_crashed)
# ...
# Add a test which is expected to crash
add_test_crashed(clang.crash.1 <clang-executable> <clang-args>)
There is also a clang-specific solution: configure the sanitizer's manner of exit via the ASAN_OPTIONS environment variable (see https://github.com/google/sanitizers/wiki/AddressSanitizerFlags). Set ASAN_OPTIONS to abort_on_error=0; when the address sanitizer detects a problem, the process will then do _exit(1) rather than (presumably) abort(), and will thus appear to have terminated cleanly. You can then pick this up using cmake's WILL_FAIL mechanism. (It's still not clear why OS X and Linux differ in this respect - but there you go.)
As a bonus, the test fails much more quickly.
(Another handy option that can improve turnaround time when running through cmake is to set ASAN_SYMBOLIZER_PATH to an empty value, which stops the address sanitizer symbolizing the stack traces. Symbolizing takes a moment, but there's no point doing it when running through cmake, since you can't see the output.)
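For reference, here is a rough sketch of setting both variables directly as CTest properties (the target and test names are made up, and this is untested):
add_executable(use_after_free use_after_free.c)
add_test(NAME asan.use_after_free COMMAND use_after_free)
set_tests_properties(asan.use_after_free PROPERTIES
    ENVIRONMENT "ASAN_OPTIONS=abort_on_error=0;ASAN_SYMBOLIZER_PATH="
    WILL_FAIL TRUE)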
Rather than do this by hand, I made a Python script that sets the environment appropriately on OS X (doing nothing on Linux), and invokes the test. I then add each asan test using a macro, along the lines of Tsyvarev's answer.
macro(add_asan_test basename)
    add_executable(${basename} ${basename}.c)
    add_test(NAME test/${basename} COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/wrap_clang_sanitizer_test.py -a $<TARGET_FILE:${basename}>)
    set_tests_properties(test/${basename} PROPERTIES WILL_FAIL TRUE)
endmacro()
This gives a simple pass/fail as quickly as possible. I'm in the habit of investigating failures by running the test in question from the shell by hand and examining the output, in which case I get the stack trace as normal (and the fact that exiting via abort is a bit slow is less of a problem).
(There are similar options for the other sanitizers, but I haven't investigated them.)

Frege putStr flushing behavior is different from Haskell or Java

Suppose you prompt for user input with a combination of putStr and getLine:
main = do
putStrLn "A line with line termination" -- printed correctly
putStr "A line without line termination, e.g. to prompt for input: " -- NOT printed
line <- getLine
putStrLn ("You entered: " ++ line)
In contrast to Haskell, Frege does not print the second line (which uses putStr rather than putStrLn). Is this missing-flush behavior intended?
If Frege deviates from Haskell behavior, I would assume it to mimic Java's behavior instead. A conceptually similar example:
public static void main(String[] args) {
System.out.println("A line with line termination");
System.out.print("A line without line termination, e.g. to prompt for input: ");
String line = new java.util.Scanner(System.in).nextLine();
System.out.println("You entered: " + line);
}
This however behaves like the Haskell variant, i.e. System.out.print gets flushed immediately.
Thanks in advance for any feedback!
PS: The (mis?)behavior can be reproduced with the latest Eclipse-Plugin as well as IntelliJ/Gradle.
Your Java code uses System.out, which is a PrintStream.
The Frege code uses a PrintWriter.
These two classes work a bit differently with respect to flushing. From the docs of PrintWriter:
Unlike the {@link PrintStream} class, if automatic flushing is enabled
it will be done only when one of the println, printf, or format methods is invoked, ...
So for your Frege code, you have to add a stdout.flush after the print to make it appear immediately.
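For example, here is a sketch of the snippet from the question with that flush added (assuming stdout is in scope, as in the standard Frege prelude):
main = do
    putStrLn "A line with line termination"
    putStr "A line without line termination, e.g. to prompt for input: "
    stdout.flush
    line <- getLine
    putStrLn ("You entered: " ++ line)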
Feel free to file an issue with the request to align Frege with the Haskell behavior in this regard. (We could leave the print as is but make the putStr add the flush automatically.)

How do I tell Octave where to find functions without picking up other files?

I've written an Octave script, hello.m, which calls subfunc.m, and which takes a single input file, data.txt, as a command-line argument, loading it with load(argv(){1}).
If I put all three files in the same directory, and call it like
./hello.m data.txt
then all is well.
But if I've got another data.txt in another directory, and I want to run my script on it, and I call
../helloscript/hello.m data.txt
this fails because hello.m can't find subfunc.m.
If I call
octave --path "../helloscript" ../helloscript/hello.m data.txt
then that seems to work fine.
The problem is that if I don't have a data.txt in the directory, then the script will pick up any data.txt that is lying around in ../helloscript.
This seems a bit fragile. Is there any way to tell Octave, preferably in the script itself, to get subfunctions from the same directory as the script, but to get everything else relative to the current directory?
The best robust solution I can think of at the moment is to inline the subfunction in the script, which is a bit nasty.
Is there a good way to do this, or is it just a thorny problem that will cause occasional hard to find problems and can't be avoided?
Is this in fact just a general problem with scripting languages that I've just never noticed before? How does e.g. python deal with it?
It seems like there should be some sort of library-load-path that can be set without altering the data-load-path.
Adding all your subfunctions to your program file is not nasty at all. Why would you think so? It is perfectly normal to have function definitions in your script. The only language I know of that does not allow this is Matlab, but that's just braindead.
The other alternative you have is to check that the input file argument, data.txt, exists. Like so:
fpath = argv (){1};
[info, err, msg] = stat (fpath);
if (err)
error ("could not stat `%s' : %s", fpath, msg);
endif
## continue your script knowing the file exists
But really, I would recommend you to use both: add your subfunctions to your main program (the only reason to keep one in a separate file is if you plan on sharing it with other programs), and always check your input arguments.
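For instance, a minimal sketch of that combined approach (the file layout and function names are made up for illustration):
#!/usr/bin/octave -qf
## hello.m - validate the input argument, then do the work
fpath = argv (){1};
[info, err, msg] = stat (fpath);
if (err)
  error ("could not stat `%s': %s", fpath, msg);
endif
data = load (fpath);
printf ("%g\n", subfunc (data));

## subfunction defined in the same file as the script
function y = subfunc (x)
  y = sum (x(:));
endfunction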

How to determine when AviSynth has an error message without seeing the video output

Is there a programmatic way to test for errors in AviSynth scripts before seeing the black and red error message in the output?
We are currently assembling AviSynth script files as part of an automated encoding routine. When something goes wrong with AviSynth or the source file, AviSynth renders a big black and red error message. Our encoder sees this as a normal video file and keeps on encoding without raising an error.
What is the best way to check for these errors without actually seeing the output from the video file?
AviSynth has support for try-catch: http://avisynth.org/mediawiki/Control_structures#The_try..catch_statement
I'm not sure how you would signal an error to your encoder from there. As far as I know you must return a clip from the script, and a return statement inside a try/catch block does not return from the entire script anyway: http://avisynth.org/mediawiki/The_full_AviSynth_grammar#Closing_Remarks
You can however log error messages to text files, so I've seen people doing this to test an AVS script for errors before running it:
script = "file_to_test.avs"
try {
    Import(script)
} catch (err) {
    WriteFileStart(BlankClip(), "C:\logfile.txt", script, """ ": " """, err, append=true)
}