Verify that a digital signed PDF content is the same as the original - pdf

I work on an app where the user has to download a PDF document, sign it with a trusted certificate, and upload this signed version back to the app.
I would like to check that the uploaded signed PDF is the same as the one the user downloaded in the first place. (To avoid that the user uploads another signed PDF).
I tried to compare the hash of the original PDF with the truncated version of the signed one (without the signature part) but it does not work if the signature is not added in "append mode".
I also read digital signing a PDF might rearrange the binary contents of your document so impossible to compare with the original.
What do you think are the possibilities to deal with this ?

Related

Digital Signature/ eSign verification fails

I have a eSigned / Digitally signed PDF, the document is signed using iText lib in detached signature. I am having issue while verifying the signature, getting message "signers identity is invalid because it has expired or not yet valid" and in signer info "There were errors building the path from signers certificate to an issuer certificate."
I have tried many ways to validate the signature but couldn't get any success. If I explicitly add signers certificate as trusted certificate then I get a green check and able to verify the signature but I think thats not the correct way to do it.
Adobe Signature setting are as follow:-
digitally signed pdf can be found heredigitally signed document
can anyone please help to resolve this issue.
I have tried many ways to validate the signature but couldn't get any success.
That is not surprising: Your signature is not valid. In the PDF signature field value there is a signing time 2020/06/11 09:28:35 GMT but your signer certificate is valid not before 2020/06/11 09:29:44 GMT and not after 2020/06/11 09:59:44 GMT. At the claimed signing time, therefore, your signer certificate was not valid yet and could not create a valid signature.
Apparently you sign using a signing service that creates a short-time certificate just in time when your signature request to it arrives. Unfortunately that is after the time iText stored as signing time in the PDF at the beginning of the signing process.
Thus, one way to resolve the issue is to tell iText to use a time slightly (e.g. two minutes) in the future.
You can do that by means of the PdfSignatureAppearance method setSignDate.
Your signature actually also violates a recommendation from the specification: The afore mentioned signing time stored in the signature field value dictionary should be used only when the time of signing is not available in the signature (the embedded signature container). In your case, though, the embedded signature container does contain a signingTime signed attribute with value 11/06/2020 09:29:44 GMT which is not before the start of certificate validity.
As that only is a recommendation, your PDF signature is not made invalid by having two signing time values. But as the values differ, this can result in different verification results by different validators, using either one or the other value.
Thus, another way to resolve the issue is to make sure no signing time value is added to the signature value dictionary at all. This additionally makes your signed PDF follow the recommendation above and so be more precise.
Unfortunately using the PdfSignatureAppearance method setSignDate to set a null value does not work, later on in the signing process this results in NullPointerException occurrence. But if you use a custom ExternalSignatureContainer implementation in your signing code (and not an ExternalSignature implementation), you can remove that entry in your modifySigningDictionary implementation. The key is PdfName.M.
As an aside, as your signing certificate is only valid for half an hour, validators may also reject it if their validation policy only trusts digital time stamps, not unsecured date-time values.
Thus, you should add revocation information and digitally time stamp the whole construct during the life time of the certificate to guarantee long term validation.
I understand what your issue is. There's a whole new way of solving it, in 3 steps.
Open in Adobe Acrobat DC or any Adobe pdf editor software you have.
In Edit, go to Preferences, in the end come to Signatures. Click on verification, more, in verification time section, select current time. Same location, verification information section, select never.
In Windows integration, tick validating signatures and validating certified documents. Click ok.
Close all popups. Come back to signature area and right click to select, validate signature.
Done ✅

iTextSharp not rendering digital signature

I've read the I-TEXT digital signature e-text, and also previous posts answered by MKL (who seems to be the authority along with Bruno on this topic).
essentially I have an Azure app service that gets the digital sig (base 64) and certificate chain from the company's signing API. The company's signing API returns a signature in Base64 along with a certificate chain.
I just want the to insert a signature object/container into the pdf so it will show in the signature panel when an end user opens up the pdf. I prefer to use deferred signing.
I've shifted from chapter 4's "clientseversigning example" to instead Deferred Signing in MKL's "How to create a PDF signature without knowing the signer certificate early".
The Company API returns a "plain" signature, that I am pretty sure, and also returns a chain of 3 string certificates.
I should note I do have the root and sub certs in advance (2 .cer files) but I am not using them in "prepping" the pdf for hashing right now since the deferred signing example doesn't make use of them obviously. For the container construction code (after getting the response from the Company API), I use the 3 certs chain returned from the company API, but I also tried it with the 2 .cer files, to no avail.
The only difference between my code and the one in the example is instead of byte[] certificateBytes = THE_RETRIEVED_CERTIFICATE_BYTES; X509Certificate x509Certificate = new X509CertificateParser().ReadCertificate(certificateBytes); I build 3 x509Certificates (one for each string in the chain returned from the Company API.
Sadly things wont work, I get these errors in Acrobat: Signature is invalid, There are errors in the formatting or information contained in this signature, signature's identity has not yet been verified, signing time is from the clock on the signer's computer...also if I click Certificate details just below of this error in Acrobat it is blank. This was pretty much the errors I was getting when trying the "clientserversigning example"
I am trying really hard and wondering what it could be... should I try modifying the estimated size from 12000 and bump it up? or the errors I am getting in Acrobat, maybe they are hinting the certificate chain from the Company API is not being picked up by the signing deferral container construction code ... I am struggling but any tips would be so greatly appreciated
Evan
Just to clarify, I am following chapter 4's clientserversigningexample but I am getting the following once my pdf is recreated with the signature from the company API
Its saying 1) there are errors in the formatting of information
2) signer's identity has not been verified
3) signing time is from the clock on the signer's computer
now as far as "prepping" the pdf before hashing it to send for signing...I don't see anything in the ClientSigning example that specifically preps it, can I assume the IText library is prepping it under the hood?
In your question and in your comments to it you appear to be in particular interested in
whether or not one can use a signature API that returns the certificates only together with the signature, and
when to to deferred signing.
Can You Use a Signature API That Provides the User Certificate Only After Signing
To answer this one first has to clarify what kind of signatures the signature API in question creates, plain signature values (e.g. PKCS#1 RSA signatures) or full-fledged CMS signature containers.
If it creates full-fledged CMS signature containers, you can create signatures following arbitrary signature profiles as long as the signature containers follow the requirements for them (which they often do). They only restriction you have is that you cannot have information from the signer certificate in the signature visualization because that visualization is defined in the signed data of the PDF.
If it only creates plain signature values, the best you can do is create and embed simple CMS containers that don't contain pointers to the signer certificate in the signed attributes (if they have any signed attributes as all to start with). Many signature policies of interest do require such pointers, but at least Adobe Reader accepts signatures without.
If you are in this situation and want to try creating signatures with such simple signature containers, you may want to use the code from this answer, section "How to create a PDF signature without knowing the signer certificate early".
When to Use Deferred Signing
The difference between deferred signing and other iText signing calls is not that deferred signing requires less information (compared to ExternalContainer signing).
In contrast to the other iText signing methods, signDeferred re-uses the outermost existing, filled signature field of the PDF to sign and merely replaces the signature container therein.
The method name is derived from the most common use case it’s used for:
In a first step a signature field is (probably first created and then) filled using signExternalContainer with an IExternalSignatureContainer implementation that calculates the document hash to be signed but does not return the final CMS container yet, merely some (usually) empty array. The generated PDF with a filled signature field (albeit without the final signature container) is then temporarily stored (in the file system or a database).
In a second step a signature container for the determined document hash is requested and (probably asynchronously) awaited.
In a final step signDeferred is used to inject the retrieved signature container into the PDF prepared in the first step.
This deferred signing process usually is preferred in setups in which the second step, the signature container creation and retrieval, can take longer than one wants to keep the resources blocked which are required for the document in signing. This includes in particular signatures generated by remote servers or clients, especially if that signing process awaits some third party clearance or activation.

Is necesary to add empty signature appearance on a PDF to be signed by multiple users

Let's say we have generated a PDF to be signed by two users. The signing process will handled on server side, so the PDF file will be updated (digital signature & timestamp) on server side. No external applications (like adobe reader or pro) will be involved on this process.
Do I need to create two "empty signature appearance" on this PDF file?

how do password protected files work?

I was looking at an app on blackberry app world to create pdf files and that app claims to be able to password protect the files. How does one password protect a file. Isn't the code to read the file available, thus the password will be useless if the program decides not to check the password?
In addition to the other answers (which focus on encryption of arbitrary files) here an answer focusing on encryption of PDFs which was the use case initially startling the OP:
The PDF standard (ISO 32000-1) describes in section 7.6 how PDFs shall be encrypted in a manner that keeps the file structure of a PDF while hiding the content. PDFs are built from numerous objects (numbers, strings, arrays, dictionaries, streams, references, ...) and the mechanism described by the specification essentially only encrypts strings and stream contents.
Just like in the generic case described e.g. by #Mark, these encrypted string and stream contents are merely a bunch of random-looking data and have to be decrypted before the PDF can be displayed, but the remaining objects are unencrypted allowing PDF viewers and editors to recognize the file as a PDF.
Furthermore the PDF specification allows for two basic encryption types, by
a user password which anyone has to enter who wants to use the PDF in any way, and
an owner password which only needs to be entered for a configurable set of uses of a PDF (e.g. printing or editing) but not for merely viewing it.
Encryption using the latter kind of password obviously can be circumvented: After all, if you can view the PDF, you can extract all the data and do essentially what you want with unless your software co-operates with the scheme and forbids you to. And, obviously, not all software does co-operate.
Essentially the owner password mechanism stores a value in the PDF derived from the password which is sufficient to decrypt the encrypted data but does not allow for easy calculation of the original password.
Assuming the app is competently written, the .pdf file is encrypted using the password to derive the encryption key -- that is, the file is not, properly speaking, a .pdf file until it gets decrypted. Before that, the file is merely a bunch of random-looking data, and the program does not know what the decryption key is until you enter the password.
If done correctly a password protected file will be encrypted with an algorithm that needs the original password to undo the encryption. The password is used to initialize the encryption/decryption process and is not stored in the file. If you give the wrong password the decryption will not work and there is no way for the program to know the correct decryption key (except doing a brute force attack).

Document signed and timestamped locally and then uploaded to the server, does it have same characteristics?

Immagine a web application that lets you digitally sign (with personal digital certificates pkcs12 released by trusted CAs) and timestamp PDF documents with a Java applet or Active X. This must obviously happen on the machine of the user because the private key of the certificate is stored locally.
So once the PDF is signed and timestamped it is uploaded on the server.
Does the uploaded file have the same features of the one created locally? Does it have sense to talk about "the original version of the file"?
I'm a bit confused on this.
Correction:
i mean digitally sign a document with the private key of a personal digital certificate (should be pkcs7, pkcs12) to ensure that it has really been signed by someone and not someone else.
If by "the original version of the file" you mean that you intend to "freeze" the document so that nobody can ever make changes to it again - that is neither possible nor the purpose of a digital signature. Anyone could simply "cut out" the a signature embedded within a document, nobody would notice.
Protecting a document from subsequent modification involves some kind of DRM mechanism. For example, "watermarking" involving steganography is used to protect photos so that noone should be able to claim ownership of a photo, even after having modified it. But the technology is not very advanced yet, most algorithms can be easily broken.
This implies that the notion of "the original version of the file" in let's say a legal dispute is something that the involved parties have to agree upon in consent. There's no way to prove origin without either consent or a trusted third party that will attest the integrity of a document, e.g. if they have an independent copy of the document.
Apart from that, uploading a file should not change its contents. The file will have the exact same properties than the local one including the signature that was added on the client side.
The signature will only attest authenticity and integrity of the document. If it is vital for your application to be able to tell that the signed document received is actually the one that was expected, then I'd advise you to do the following:
Create the PDF on the server
Create a hash of the document (same algorithm that will be used by the signature applet)
Send the PDF to the client
Let the client sign it and send it back
Compare the client's hash with the one previously computed on the server
Validate the signature
Validating the signature will ensure integrity and authenticity, comparing the hashes will guarantee you that the signed document you received on the server is indeed a signed version of the original document previously created.
Concerning timestamps using local clocks: they're worthless, it's very easy to cheat. What you actually should use there is RFC 3161-compliant cryptographically secured timestamps, issued by a trusted third party. Currently that's the only reliable way to include the notion of time in PDF signatures. There's also built-in support for this in Adobe Reader for example. As these services are generally not available for free, it would make sense to add such a timestamp on the server after receiving the signed document. They are added as an unsigned attribute to the CMS (Adobe still speaks of PKCS7) signature, so it won't break the signature and can safely be added after signature creation.
Okay, let's try to answer your question (as I understand it).
You have some software which uses some private key (and a clock) to add a signature to a file.
This signature is depending on the contents of the file, and thus makes sure that the signer knew (or could have known) the contents of the file at the time it signed it. (There are some ways to have "blind signatures", but I assume this is not the case here.)
Uploading the signed file anywhere does not change anything here.
About the timestamp: The key holder can put in any timestamp it wants - so this only helps if you want to prove knowledge of the document at some point in time against the key holder, not if you are the key holder and want to prove that you signed at some point in time and not earlier or later. (Also, are you sure his clock is not skewed?)
About whether this is legally relevant, you will have to ask your lawyer. It might depend on
the jurisdiction in which the signature happened, and the one in which you want the signed document to be valid
whether the owner of the key had a chance to actually read the document before signing
whether the owner of the key had actually a choice of signing or not.
If you use some applet or ActiveX control in the user's browser, I would not be totally sure that the last two points really hold.