How do password-protected files work? - pdf

I was looking at an app on BlackBerry App World that creates PDF files, and the app claims to be able to password protect the files. How does one password protect a file? Isn't the code to read the file available, so that the password is useless if the program decides not to check it?

In addition to the other answers (which focus on encryption of arbitrary files), here is an answer focusing on the encryption of PDFs, which was the use case that originally puzzled the OP:
The PDF standard (ISO 32000-1) describes in section 7.6 how PDFs shall be encrypted in a manner that keeps the file structure of a PDF while hiding the content. PDFs are built from numerous objects (numbers, strings, arrays, dictionaries, streams, references, ...) and the mechanism described by the specification essentially only encrypts strings and stream contents.
Just like in the generic case described e.g. by Mark, these encrypted string and stream contents are merely a bunch of random-looking data and have to be decrypted before the PDF can be displayed, but the remaining objects are unencrypted, allowing PDF viewers and editors to recognize the file as a PDF.
Furthermore, the PDF specification allows for two basic encryption types, based on
a user password, which anyone who wants to use the PDF in any way has to enter, and
an owner password, which only needs to be entered for a configurable set of uses of a PDF (e.g. printing or editing) but not for merely viewing it.
Encryption using the latter kind of password obviously can be circumvented: after all, if you can view the PDF, you can extract all the data and do essentially what you want with it, unless your software co-operates with the scheme and forbids you to. And, obviously, not all software does co-operate.
Essentially the owner password mechanism stores a value in the PDF derived from the password which is sufficient to decrypt the encrypted data but does not allow for easy calculation of the original password.

Assuming the app is competently written, the .pdf file is encrypted using the password to derive the encryption key -- that is, the file is not, properly speaking, a .pdf file until it gets decrypted. Before that, the file is merely a bunch of random-looking data, and the program does not know what the decryption key is until you enter the password.

If done correctly a password protected file will be encrypted with an algorithm that needs the original password to undo the encryption. The password is used to initialize the encryption/decryption process and is not stored in the file. If you give the wrong password the decryption will not work and there is no way for the program to know the correct decryption key (except doing a brute force attack).
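The idea in the answers above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `derive_keys`, `encrypt` and `decrypt` are made up for this example, and the XOR keystream is a toy stand-in for a real cipher such as AES, which the standard library does not provide. What it shows is that the password is never stored; only a random salt, the ciphertext and an integrity tag are.

```python
import hashlib
import hmac
import os


def derive_keys(password: str, salt: bytes, iterations: int = 100_000):
    """Derive an encryption key and a MAC key from the password (PBKDF2)."""
    material = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=64
    )
    return material[:32], material[32:]


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT for real use): XOR with SHA-256(key || counter)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def encrypt(password: str, plaintext: bytes):
    """Return (salt, ciphertext, tag); the password itself is not part of it."""
    salt = os.urandom(16)
    enc_key, mac_key = derive_keys(password, salt)
    ciphertext = keystream_xor(enc_key, plaintext)
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return salt, ciphertext, tag


def decrypt(password: str, salt: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Re-derive the keys from the entered password and undo the encryption."""
    enc_key, mac_key = derive_keys(password, salt)
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong password or corrupted file")
    return keystream_xor(enc_key, ciphertext)
```

A program that "decides not to check the password" gains nothing here: without the right password it cannot derive the key, so all it has is the random-looking ciphertext.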

Related

How is an encrypted PDF document encrypted by both user and owner password?

When I encrypt a PDF document providing both a user and an owner password, I can open the document using either of those passwords.
As far as I understand, modern PDF encryption uses AES128/256 which works with one key (password).
Is the document duplicated internally and each copy encrypted with a password? Having two encrypted documents inside the file was not apparent from the encrypted document file size.
PS: I know the "user experience" differences between user and owner passwords in PDF.
Generally speaking, in cases like this, a "master key" is randomly created and used to encrypt. For each key that will actually be used to access the document, we encrypt the master key with that user or owner key. The results of these (small) encryption operations are included in the file directly (i.e. multiple copies of the encrypted master key, not of the document).
Thus, to decrypt the file, we need the master key, and to get the master key, we can provide either the "user" or "owner" key and use that key to decrypt one of the master key ciphertexts.
There are not two different encrypted copies of the document within a single PDF.
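A minimal sketch of this "wrapped master key" idea follows. It illustrates the general scheme described above, not the exact algorithm from the PDF specification; the function names, the `header` layout and the XOR-based wrapping are assumptions made for this example (a real implementation would wrap the key with AES).

```python
import hashlib
import hmac
import os


def _kek(password: str, salt: bytes) -> bytes:
    """Key-encryption key derived from a password (PBKDF2, 32 bytes)."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 100_000, dklen=32
    )


def wrap(master_key: bytes, password: str) -> dict:
    """Toy key wrap: XOR the master key with the password-derived key."""
    salt = os.urandom(16)
    wrapped = bytes(m ^ k for m, k in zip(master_key, _kek(password, salt)))
    return {"salt": salt, "wrapped": wrapped}


def unwrap(entry: dict, password: str) -> bytes:
    """Inverse of wrap(); yields garbage if the password is wrong."""
    kek = _kek(password, entry["salt"])
    return bytes(w ^ k for w, k in zip(entry["wrapped"], kek))


def protect(user_pw: str, owner_pw: str):
    """Create one random master key and store it wrapped under both passwords."""
    master_key = os.urandom(32)
    header = {
        "user": wrap(master_key, user_pw),
        "owner": wrap(master_key, owner_pw),
        # Check value so we can tell whether an unwrap attempt succeeded.
        "check": hashlib.sha256(b"check" + master_key).digest(),
    }
    return master_key, header


def recover_master_key(header: dict, password: str):
    """Try the password against both slots; return (slot, key) or None."""
    for slot in ("user", "owner"):
        candidate = unwrap(header[slot], password)
        check = hashlib.sha256(b"check" + candidate).digest()
        if hmac.compare_digest(check, header["check"]):
            return slot, candidate
    return None
```

The document body would be encrypted once with `master_key`; the header adds only a few dozen bytes per password, which is consistent with the file size not doubling.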
The user password is used strictly to limit the ability to open a document.
The master password (also called the owner password) controls the permissions of a document, e.g. Document Assembly not being permitted or Form Filling not being allowed. A master password can also be used to open a PDF in place of a user password.
The type of encryption used doesn't depend on whether there is only a master password or both a user and a master password.

What different options for password protection does PDF support?

In pdftk I can see three options:
$ pdftk input.pdf output protected-userpw.pdf userpw very_secret
$ pdftk input.pdf output protected-ownerpw.pdf ownerpw very_secret
$ pdftk input.pdf input_pw very_secret output protected-input.pdf
When I open protected-ownerpw.pdf and protected-input.pdf I am not asked for a password. Only protected-userpw.pdf gives the expected result. What do ownerpw and input_pw do?
I use qpdf to create unprotected files from protected ones. Now I wonder if this always works. Which password protection mechanisms does the PDF format support, and which of them does qpdf support?
Owner and user passwords have different meanings
In a PDF document you can set security rights, such as
printing allowed
copying text allowed
filling out formfields allowed
...
These are actually only flags inside the document, and it is up to the PDF viewer whether it obeys them or not.
When you set only an owner password, the user password is automatically set to empty. The PDF file is still encrypted (all streams and strings are saved in encrypted form), but it can be decrypted with the empty password. So you can open the PDF without entering any password, but once it is opened you only have the rights specified in the security settings.
So you may, for example, not be allowed to print the document.
When you set both a user and an owner password, whoever opens the PDF needs to specify either the owner or the user password. Someone who enters the user password gets the rights specified in the security settings of the PDF. Someone who enters the owner password is granted all rights.
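How a co-operating viewer might evaluate these flags can be sketched as follows. The bit positions are given as I recall them from the permissions table of ISO 32000-1 (bit 1 being the lowest-order bit) and should be treated as an assumption to verify against the specification; the `Permission` class and the `allowed` function are hypothetical names for this example.

```python
from enum import IntFlag


class Permission(IntFlag):
    # Bit positions assumed from ISO 32000-1 (bit 1 = lowest-order bit).
    PRINT = 1 << 2        # bit 3
    MODIFY = 1 << 3       # bit 4
    COPY = 1 << 4         # bit 5
    ANNOTATE = 1 << 5     # bit 6
    FILL_FORMS = 1 << 8   # bit 9
    ASSEMBLE = 1 << 10    # bit 11


def allowed(requested: Permission, doc_flags: Permission,
            opened_as_owner: bool) -> bool:
    """Decide whether a co-operating viewer should permit an operation."""
    if opened_as_owner:
        # The owner password grants all rights, regardless of the flags.
        return True
    return (doc_flags & requested) == requested
```

Nothing in the file enforces this check; a viewer that simply skips the call to `allowed` can do whatever it likes with the decrypted content, which is why these rights are only advisory.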
PDF encryptions
Many different encryption algorithms are supported in PDF: RC4 or AES with an encryption key length of 40 up to 128 bits, and also user-defined algorithms, which aren't covered by the PDF specification. A recent extension to the PDF 1.7 specification (extension level 3) also specified AES 256. A later extension modified AES 256 a bit and fixed a theoretical security hole. Several tools still have problems with these last extensions (but I don't know about qpdf).
These extensions are all incorporated into the ISO specification of PDF 2.0, which was released today.

Want to customise or change the message that gets displayed in the password prompt of a password-protected PDF

Is there a way to customise or change the message that gets displayed in the document open password dialog box when trying to open a password-protected PDF file?
Default message - "filename.pdf is protected. please enter a Document Open Password."
The message shown is completely up to the PDF viewer or processor in question.
In general you cannot prescribe it but you may create your own viewer showing the text you prefer.
PS: As the OP still hoped for a different answer (and asked essentially a duplicate question here):
With regard to opening password-protected PDF files, the PDF specification only states:
If a user attempts to open an encrypted document that has a user password, the conforming reader shall first try to authenticate the encrypted document using the padding string defined in 7.6.3.3, "Encryption Key Algorithm" (default user password):
If this authentication attempt is successful, the conforming reader may open, decrypt and display the document on the screen.
If this authentication attempt fails, the application should prompt for a password. Correctly supplying either password (owner or user password) should enable the user to open the document, decrypt it, and display it on the screen.
(ISO 32000-1 section 7.6.3.1)
It does not present any mechanism to supply a message for prompting for the password.
Please note that the specification even makes prompting for a password merely a recommendation ("should", not "shall"). Completely in accord with the specification, therefore, other ways to retrieve a password might be tried instead, or such password protected documents might be ignored completely!
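The quoted flow can be sketched like this. The `authenticate` and `prompt` callables are hypothetical placeholders for whatever a concrete viewer provides; the point is that the wording of the prompt lives entirely inside the viewer's `prompt` implementation, and nothing in the document feeds into it.

```python
from typing import Callable, Optional


def open_encrypted(authenticate: Callable[[str], bool],
                   prompt: Callable[[], Optional[str]],
                   max_attempts: int = 3) -> Optional[str]:
    """Return the password that opened the document, or None if none did.

    authenticate: hypothetical callback that checks a candidate password
                  (user or owner) against the document.
    prompt:       hypothetical callback that asks the user for a password
                  and returns None if the user cancels.
    """
    # First try the default (empty) user password, as the specification asks.
    if authenticate(""):
        return ""
    # Only if that fails does the viewer ask, with a message of its choosing.
    for _ in range(max_attempts):
        candidate = prompt()
        if candidate is None:
            return None
        if authenticate(candidate):
            return candidate
    return None
```

A viewer with its own `prompt` function can of course show any text it likes, which is the "create your own viewer" option mentioned above.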
That being said, specific PDF viewers might allow providing a prompt message in a proprietary manner; after all, the early signing mechanisms in Adobe Reader even allowed the PDF to provide appearances for successfully and for unsuccessfully verified signatures, which made fraud possible! I doubt, though, that current versions of serious viewers allow providing password prompt messages even in a proprietary way.

VB.NET Encryption using a password

I have a desktop application where the user has a library of encrypted ZIP files. The configuration file holds the master password to decrypt these ZIPs. The idea is that the user enters the program password they chose when they installed the application to open the program and that password is used to decrypt the master password stored in the configuration file.
The main point of all of this security is that even if someone had access to the hard disk and user's Windows account, they still can't get inside the ZIP files without their password, ideally.
To validate the program password the user typed in I'm using the C# hashing code (converted to VB) from here (at the bottom of the page):
http://crackstation.net/hashing-security.htm
So far so good. We're only storing a hash of the program password, so a hacker couldn't read it by looking at the config file.
Now, I'm trying to implement encryption as found here:
http://msdn.microsoft.com/en-us/library/yx129kfs.aspx
Actual Question:
So which of the following gets stored as plain text in the config file to be used to decrypt the ZIP master password at runtime with the program password the user entered?
The salt used to generate the encryption key (can this be the same as the salt used to hash the password as above?)
The initialization vector (IV)?
The encryption key? (Probably not...) (k1 in MS's example)
The decryption key? (k2 in MS's example)
In MS's example, they've got the encryption and decryption all jumbled up together... I've got a lot of pieces but I don't know how to put them together...
Update
I've read that AES encryption is more secure than the Triple DES encryption MS is using in their example above. Seeing that we're using AES on the zip files, it would be nice to use AES for the ZIP file password too.
So, how can I combine this AES example:
http://msdn.microsoft.com/en-us/library/system.security.cryptography.aescryptoserviceprovider.aspx#Y2300
With PBKDF2 to generate the encryption key?
You need to store the salt and iteration count for the PBKDF function (which must not be the same salt used to hash the password).
The key would be the result of PBKDF, which, for a given salt and iteration count, is fixed.
You also need to store the IV.
If you want to, you can use a (PBKDF) hash of the password using a third (stored) salt as the IV.
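As a rough sketch of what ends up in the configuration file, here is the same idea in Python rather than VB.NET (in .NET, the `Rfc2898DeriveBytes` class plays the role of `pbkdf2_hmac`). The record layout and the function names are assumptions for this example; the AES step itself is omitted because it needs a crypto library. Note that the key is derived on demand and never written to the file.

```python
import base64
import hashlib
import json
import os


def _b64(raw: bytes) -> str:
    return base64.b64encode(raw).decode("ascii")


def new_config_record(iterations: int = 200_000) -> dict:
    """Values that may be stored in plain text next to the ciphertext."""
    return {
        "kdf_salt": _b64(os.urandom(16)),   # separate from the hashing salt
        "kdf_iterations": iterations,
        "iv": _b64(os.urandom(16)),         # AES block size is 16 bytes
    }


def derive_key(program_password: str, record: dict) -> bytes:
    """Recompute the 256-bit key at runtime from the entered password."""
    salt = base64.b64decode(record["kdf_salt"])
    return hashlib.pbkdf2_hmac(
        "sha256",
        program_password.encode("utf-8"),
        salt,
        record["kdf_iterations"],
        dklen=32,
    )


def save_record(record: dict) -> str:
    """Serialize the record for the config file (JSON chosen for the example)."""
    return json.dumps(record)


def load_record(text: str) -> dict:
    return json.loads(text)
```

The derived key and the stored IV would then be handed to the AES implementation to decrypt the ZIP master password.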

Document signed and timestamped locally and then uploaded to the server, does it have same characteristics?

Imagine a web application that lets you digitally sign (with personal digital certificates, PKCS#12, issued by trusted CAs) and timestamp PDF documents with a Java applet or ActiveX control. This must obviously happen on the user's machine because the private key of the certificate is stored locally.
So once the PDF is signed and timestamped it is uploaded on the server.
Does the uploaded file have the same features as the one created locally? Does it make sense to talk about "the original version of the file"?
I'm a bit confused on this.
Correction:
I mean digitally signing a document with the private key of a personal digital certificate (should be pkcs7, pkcs12) to ensure that it has really been signed by that person and not by someone else.
If by "the original version of the file" you mean that you intend to "freeze" the document so that nobody can ever make changes to it again, that is neither possible nor the purpose of a digital signature. Anyone could simply "cut out" the signature embedded within a document, and nobody would notice.
Protecting a document from subsequent modification involves some kind of DRM mechanism. For example, "watermarking" involving steganography is used to protect photos so that no one should be able to claim ownership of a photo, even after having modified it. But the technology is not very advanced yet, and most algorithms can be easily broken.
This implies that the notion of "the original version of the file" in, let's say, a legal dispute is something that the involved parties have to agree upon by consent. There's no way to prove origin without either consent or a trusted third party that will attest to the integrity of a document, e.g. if they have an independent copy of the document.
Apart from that, uploading a file should not change its contents. The file will have exactly the same properties as the local one, including the signature that was added on the client side.
The signature will only attest authenticity and integrity of the document. If it is vital for your application to be able to tell that the signed document received is actually the one that was expected, then I'd advise you to do the following:
Create the PDF on the server
Create a hash of the document (same algorithm that will be used by the signature applet)
Send the PDF to the client
Let the client sign it and send it back
Compare the client's hash with the one previously computed on the server
Validate the signature
Validating the signature will ensure integrity and authenticity, comparing the hashes will guarantee you that the signed document you received on the server is indeed a signed version of the original document previously created.
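The hash comparison in the steps above might look like the following sketch. It assumes SHA-256 and that both sides hash the same byte range of the document (an embedded signature changes the file, so in practice the hash covered by the signature would be the one to compare); the function names are made up for this example, and signature validation itself is not shown.

```python
import hashlib
import hmac


def document_digest(pdf_bytes: bytes) -> str:
    """Hex SHA-256 digest of the document bytes, computed on the server."""
    return hashlib.sha256(pdf_bytes).hexdigest()


def is_expected_document(server_digest: str, client_digest: str) -> bool:
    """Compare the stored digest with the one reported for the signed file."""
    return hmac.compare_digest(server_digest, client_digest)
```

The server would store `document_digest(...)` before sending the PDF out and call `is_expected_document(...)` when the signed version comes back, in addition to validating the signature.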
Concerning timestamps using local clocks: they're worthless, it's very easy to cheat. What you actually should use there is RFC 3161-compliant cryptographically secured timestamps, issued by a trusted third party. Currently that's the only reliable way to include the notion of time in PDF signatures. There's also built-in support for this in Adobe Reader for example. As these services are generally not available for free, it would make sense to add such a timestamp on the server after receiving the signed document. They are added as an unsigned attribute to the CMS (Adobe still speaks of PKCS7) signature, so it won't break the signature and can safely be added after signature creation.
Okay, let's try to answer your question (as I understand it).
You have some software which uses some private key (and a clock) to add a signature to a file.
This signature depends on the contents of the file, and thus makes sure that the signer knew (or could have known) the contents of the file at the time of signing. (There are some ways to have "blind signatures", but I assume this is not the case here.)
Uploading the signed file anywhere does not change anything here.
About the timestamp: The key holder can put in any timestamp it wants - so this only helps if you want to prove knowledge of the document at some point in time against the key holder, not if you are the key holder and want to prove that you signed at some point in time and not earlier or later. (Also, are you sure his clock is not skewed?)
About whether this is legally relevant, you will have to ask your lawyer. It might depend on
the jurisdiction in which the signature happened, and the one in which you want the signed document to be valid
whether the owner of the key had a chance to actually read the document before signing
whether the owner of the key had actually a choice of signing or not.
If you use some applet or ActiveX control in the user's browser, I would not be totally sure that the last two points really hold.