PDFBox: how can I check a PDF's edit permission? [duplicate]

I have a very strange problem and I am not sure where the issue is. I am creating a PDF and not setting any security restrictions or a password. When I open the PDF in Adobe Reader DC and look at the document properties, it shows the Security Method as No Security. However, Document Assembly and Page Extraction are set to Not Allowed.
The PDF was created from a Word document and I simply did a save as PDF, no other options.

In General
Please be aware that the "Document Restrictions Summary" summarizes restrictions that arise from a number of factors; the following ones come to mind:
Restrictions applied in the course of encryption
When encrypting a PDF, permissions for a number of features can be restricted for a regular user. Thus, if the PDF is opened with the user password, these restrictions apply and are shown in the summary; if it is opened with the owner password, they don't apply.
These are the restrictions one usually thinks of when checking the document properties Security tab.
Restrictions applied in the course of signing (certification & approval)
When a PDF is digitally signed with an integrated signature, a number of features are automatically restricted, and some more features may be restricted depending on the MDP transforms and locks applied by the signatures. These restrictions also are shown in the summary.
Restrictions applied by the viewer software used
The viewer you use may restrict what you can do with a PDF; e.g. a number of features of the Acrobat Pro editions are either absent from Adobe Reader or present but disabled by default. These restrictions also appear in the summary.
These viewer-related restrictions may even differ based on the kind of document you have. E.g. in Adobe Reader they differ between PDF documents carrying an XFA form definition and those that don't.
Restrictions changed by usage rights signatures (aka Reader Enabling)
There is a special kind of PDF signature (usage rights signatures) which can lift some restrictions caused by the viewer software. If a PDF contains a valid usage rights signature, some usually disabled features of the viewer may be enabled, which is also reflected in the summary.
If a PDF contains a usage rights signature which has been invalidated, e.g. by disallowed changes to the document, not only those usually disabled features remain disabled but some more features may become disabled, which again shows in the summary.
There may be additional factors still...
In Your Case
The "Not Allowed" entries you see for your file in Adobe Reader DC are restrictions of the third type listed above: they are applied by the viewer software used. If you opened the file in a full Acrobat edition (e.g. Acrobat Pro), those entries would become "Allowed".
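For the first kind of restriction (encryption), the permissions live in a single integer, the /P entry of the encryption dictionary, so they can be checked mechanically. Below is a minimal sketch of decoding that value, assuming the bit positions of the standard security handler as I remember them from ISO 32000; the class and enum names are made up for illustration and are not part of any library.

```java
import java.util.EnumSet;
import java.util.Set;

/** Decodes the /P permission flags of the standard security handler (illustrative names). */
final class PdfPermissionBits {

    /** User access permissions; bit positions are 1-based, as in the specification. */
    enum Permission {
        PRINT(3),
        MODIFY_CONTENTS(4),
        COPY_OR_EXTRACT(5),
        MODIFY_ANNOTATIONS(6),
        FILL_IN_FORMS(9),
        EXTRACT_FOR_ACCESSIBILITY(10),
        ASSEMBLE_DOCUMENT(11),
        PRINT_HIGH_QUALITY(12);

        final int mask;

        Permission(int bitPosition) {
            this.mask = 1 << (bitPosition - 1);
        }
    }

    /** Returns the operations that the given /P value grants to a regular user. */
    static Set<Permission> allowed(int p) {
        Set<Permission> result = EnumSet.noneOf(Permission.class);
        for (Permission permission : Permission.values()) {
            if ((p & permission.mask) != 0) {
                result.add(permission);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // -3904 clears every permission bit, -1 sets all of them.
        System.out.println(allowed(-3904));
        System.out.println(allowed(-1));
        System.out.println(allowed(-3904 | Permission.ASSEMBLE_DOCUMENT.mask));
    }
}
```

With PDFBox you would not decode the bits yourself: PDDocument.getCurrentAccessPermission() returns an AccessPermission object with methods such as canModify() and canAssembleDocument(). Either way, this only covers the encryption-based permissions; it says nothing about what a particular viewer chooses to disable.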

Related

TCPDF SetProtection method is not working as expected

I started writing here:
PHP PDF password protection (no open without password)
But I can't add comments due to my reputation here (I'm better on AskUbuntu but I can't take my rep points from there). I also started a bounty there, and if someone will answer here in two days with an acceptable solution, I will award there.
Now, the problem: SetProtection method is not working as expected.
Wanted behaviour: create a protected/encrypted PDF document with TCPDF library so that the document view is always granted to everyone without asking any password, but if one tries to edit, a password is requested.
I use the following syntax:
$pdf->SetProtection(array('modify', 'copy', 'annot-forms', 'fill-forms', 'extract', 'assemble'), null, 'mypwd', 1);
I can open the file with a pdf viewer as expected.
If I try to open the file with LibreOffice Draw, the password is requested (as expected), but I'm able to edit the document BOTH with mypwd (expected) AND with a blank password (NOT expected).
What is the right syntax, if any, to have pdf readable by everyone BUT editable ONLY with "mypwd" provided?
EDIT:
Here is a file with a blank user password and a strong master password. Ilovepdf.com finds it UNLOCKED, and LibreOffice Draw can edit it.
This is NOT the expected behaviour.
https://www.dropbox.com/s/864p8xjh1ue041z/tracking_12750_16.pdf?dl=0
As far as I can see your example PDF is encrypted just the way you wanted, with an empty user password and a non-empty owner password. Thus, TCPDF does just what it was asked to do.
Most likely the problem is that your expectation is too strong: if a program can open a PDF for reading, that program can do anything with the PDF, no matter how restricted it is configured to be. The permissions and the different owner and user roles require the cooperation of the software in question; they are not technically enforced.
This already is clear from the specification:
Once the document has been opened and decrypted successfully, a PDF reader technically has access to the entire contents of the document. There is nothing inherent in PDF encryption that enforces the document permissions specified in the encryption dictionary. PDF readers shall respect the intent of the document creator by restricting user access to an encrypted PDF file according to the permissions contained in the file.
(ISO 32000-2, section 7.6.4 Standard security handler)
Apparently LibreOffice Draw simply does not behave as required by the PDF specification, i.e. it does not properly restrict user access to an encrypted PDF file according to the permissions contained in the file. Perhaps this is by design, perhaps it is just a programming glitch.
You should simply be aware that your expectation to
create a protected/encrypted PDF document with TCPDF library so that the document view is always granted to everyone without asking any password, but if one tries to edit, a password is requested.
cannot be implemented using standard PDF encryption facilities for arbitrary PDF processors, merely for those that follow the PDF specification requirement quoted above.
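To make the "cooperation" point concrete, here is a toy sketch in Java. It is not TCPDF or LibreOffice code; OpenedPdf and both edit methods are hypothetical stand-ins, and the only fact taken from the PDF format is that bit 4 of /P is the "modify contents" permission. Once the content is decrypted, the check is just an if statement that a program may or may not execute.

```java
/** Shows that permission flags only bind software that chooses to check them. */
final class CooperativeEnforcement {

    /** Bit 4 of /P (1-based): modifying the document contents. */
    static final int MODIFY_CONTENTS = 1 << 3;

    /** Hypothetical stand-in for a PDF that has already been decrypted. */
    static final class OpenedPdf {
        final StringBuilder content = new StringBuilder("original");
        final int permissions;
        final boolean openedWithOwnerPassword;

        OpenedPdf(int permissions, boolean openedWithOwnerPassword) {
            this.permissions = permissions;
            this.openedWithOwnerPassword = openedWithOwnerPassword;
        }
    }

    /** A processor that follows the specification: it checks before editing. */
    static boolean editRespectingPermissions(OpenedPdf pdf, String addition) {
        boolean allowed = pdf.openedWithOwnerPassword
                || (pdf.permissions & MODIFY_CONTENTS) != 0;
        if (allowed) {
            pdf.content.append(addition);
        }
        return allowed;
    }

    /** A processor that skips the check: nothing in the file stops it. */
    static boolean editIgnoringPermissions(OpenedPdf pdf, String addition) {
        pdf.content.append(addition);
        return true;
    }
}
```

Both methods receive exactly the same decrypted object; the difference between them is purely a matter of the program's behaviour, which is why an empty user password cannot be combined with enforced edit protection.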
There are some providers of PDF DRM software solutions which are not so easy to circumvent, but I doubt any of them can withstand a determined hacker. (Unless the solution in question is not giving the PDF to the user at all but only images in a custom, webservice-based viewer; but this is not your use case.)
Depending on your actual requirements, you might want to look into using digital signatures instead of encryption; if your objective is to let any recipient verify that they received your document contents and not something someone else edited into it, signatures appear more appropriate.

We receive signed PDF documents with ulterior modifications

Maybe this one would fit better on so security? I'm not sure...
These are the facts:
We have a web app where users download a PDF document with a form, fill it in, sign it with their electronic certificate and upload it back to our environment.
We've seen cases where the uploaded document is signed but shows some fields that were altered after the signature. If we check the integrity of the PDF signatures, the viewer reports that data was altered after signing, yet the signature is still shown as fine and valid.
If we right-click on the signature and select "See signed version" we see the real data loaded on the moment of the signature.
Now, this goes against my general perception of electronic signature functionality. If any change is made to the document (or the data loaded into it) after I make a signature, this signature should become invalid, as the document has been altered.
The behaviour of the PDF seems to be different: not only is the signature still valid, but the "default version" that you see when you open the document is the latest one, not the signed one.
Now I'm wondering
Is this some kind of bug, or is it expected behaviour?
Is there any place where info on the matter can be found? (Google keeps redirecting me again and again to "how to sign a PDF" articles.)
If this is a defined behaviour, how do you deal with it?
Now, this goes against my general perception of electronic signature functionality. If any change is made to the document (or the data loaded into it) after I make a signature, this signature should become invalid, as the document has been altered.
The behaviour of the PDF seems to be different: not only is the signature still valid, but the "default version" that you see when you open the document is the latest one, not the signed one.
Is this some kind of bug, or is it expected behaviour?
It is expected behavior.
You have to be aware of two special factors here:
A PDF signature field contains the information of the byte ranges signed. Obviously not the whole file can be signed, as the signature itself is embedded and cannot be part of the signed bytes. Thus, the signed byte ranges need to be recorded somewhere. Cf. this answer on Information Security Stack Exchange.
Additions to a PDF can be made by appending to the existing document, a process called an incremental update. These updates can in turn be signed, etc.; also cf. the answer referenced above.
Thus, when changes are made to a PDF by means of an incremental update, the existing integrated signatures in the document still correctly sign their respective signed byte ranges. They remain mathematically valid in spite of the added changes.
Furthermore the current contents of a PDF are defined in particular by the newest incremental update, so when you open the document it shows the content including the last changes, not the signed one.
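The two factors above can be checked with simple arithmetic on the signature's /ByteRange array. The following is a rough sketch, assuming the usual layout of two ranges [offset1 length1 offset2 length2] with the signature container sitting in the gap between them; the class and method names are invented for this example, and a real validator would have to check considerably more (e.g. that the first range starts at 0 and that the gap contains nothing but the signature value).

```java
import java.util.Arrays;

/** Arithmetic on a signature's /ByteRange (illustrative helper, two-range layout assumed). */
final class SignedRevisionCheck {

    /** Offset just past the last byte covered by a /ByteRange [a b c d]. */
    static long signedRevisionEnd(long[] byteRange) {
        if (byteRange.length != 4) {
            throw new IllegalArgumentException("expected [offset1 length1 offset2 length2]");
        }
        return byteRange[2] + byteRange[3];
    }

    /** True if bytes were appended after the signed revision, e.g. by an incremental update. */
    static boolean hasLaterRevisions(long[] byteRange, long fileLength) {
        return fileLength > signedRevisionEnd(byteRange);
    }

    /** The file as it was when signed: everything up to the end of the signed range. */
    static byte[] signedVersion(byte[] file, long[] byteRange) {
        long end = signedRevisionEnd(byteRange);
        if (end > file.length) {
            throw new IllegalArgumentException("byte range extends past the end of the file");
        }
        return Arrays.copyOf(file, Math.toIntExact(end));
    }
}
```

For a signature with the byte range [0 840 960 240], the signed revision ends at offset 1200. If the uploaded file is longer than that, something was appended after signing, and cutting the file at offset 1200 gives what the viewer presents as the signed version.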
Now, while this sounds like PDF signatures have no meaning, this is not the case. The specification ISO 32000-1 clearly defines which changes are allowed to be made in an incremental update to a certified (= signed with some special flags) base version of a document, and Adobe in their Acrobat and Reader software have extrapolated restrictions from this for signed but not certified documents, cf. this answer on Stack Overflow.
In particular at most the following changes are allowed:
Adding signature fields
Adding or editing annotations
Supplying form field values
Digitally signing
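For certified documents, which of these changes are actually permitted depends on the P value in the DocMDP transform parameters of the certification signature: as far as I recall the specification, 1 permits no changes, 2 permits form fill-in, page template instantiation and signing, and 3 additionally permits annotation changes. A minimal sketch of that mapping follows; the Change categories are a simplification of mine, and detecting which changes a revision actually contains is the hard part that is left out here.

```java
import java.util.EnumSet;
import java.util.Set;

/** Maps a DocMDP permission level to the kinds of change it tolerates (simplified). */
final class DocMdpLevels {

    /** Coarse, illustrative change categories; OTHER stands for anything else. */
    enum Change { FILL_IN_FORM, SIGN, ANNOTATE, OTHER }

    /** Changes permitted after certification for the given DocMDP P value. */
    static Set<Change> permittedChanges(int p) {
        switch (p) {
            case 1:
                return EnumSet.noneOf(Change.class);
            case 2:
                // Level 2 also covers instantiating page templates (not modelled here).
                return EnumSet.of(Change.FILL_IN_FORM, Change.SIGN);
            case 3:
                return EnumSet.of(Change.FILL_IN_FORM, Change.SIGN, Change.ANNOTATE);
            default:
                throw new IllegalArgumentException("DocMDP P must be 1, 2 or 3: " + p);
        }
    }

    /** True if every detected change is permitted at the given level. */
    static boolean acceptable(int p, Set<Change> detectedChanges) {
        return permittedChanges(p).containsAll(detectedChanges);
    }
}
```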
If this is a defined behaviour, how do you deal with it?
As the documents originate from you, you can start by applying a certification signature to the document which allows as few changes as possible for your use case.
Then you can define signature lock information for the signature fields your users are to sign. In this lock information you can e.g. prescribe that after signing the given signature field, a number of form fields shall be read-only.
Finally you only accept back PDFs which still contain your certification signature and to which no disallowed changes were added.
There actually are numerous PDFs which are certified and contain a number of fields for additional approval signatures, and each of the approval signature fields is coupled with some form fields which will not be editable anymore after signing. After all the signature fields are signed, all fields are read-only.
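The coupling between a signature field and the form fields it freezes is expressed by the signature field's lock dictionary, which has an Action (All, Include or Exclude) and, for the latter two, a list of field names. The sketch below shows how the set of locked fields could be derived from that information; it works on plain field name strings rather than on real PDF objects, and the class name is hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Computes which form fields a signature field lock makes read-only (illustrative). */
final class SignatureFieldLock {

    /** The three values of the lock dictionary's Action entry. */
    enum Action { ALL, INCLUDE, EXCLUDE }

    /**
     * @param action     the lock action
     * @param lockFields field names listed in the lock dictionary (ignored for ALL)
     * @param allFields  names of all form fields in the document
     */
    static Set<String> lockedFields(Action action, List<String> lockFields, List<String> allFields) {
        Set<String> locked = new LinkedHashSet<>(allFields);
        switch (action) {
            case ALL:
                break;                          // every field is locked
            case INCLUDE:
                locked.retainAll(lockFields);   // only the listed fields
                break;
            case EXCLUDE:
                locked.removeAll(lockFields);   // everything except the listed fields
                break;
        }
        return locked;
    }
}
```

When checking an uploaded document, any form field value that changed after a signature although that signature's lock covers the field would be a reason to reject the file.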
Is there any place where info on the matter can be found? (Google keeps redirecting me again and again to "how to sign a PDF" articles.)
You should in particular look at the PDF specification ISO 32000-1 and some Adobe documents on the behavior of their software. You'll find links at the bottom of the Stack Overflow documentation page that the above-mentioned links point to.

Merging write protected pdfs using itextsharp [duplicate]

I want to merge several PDF documents into one. The source documents can consist of PDFs created by me and others created by other organisations. I have no control over the permissions attached to documents not created by me. Some of these documents (those not created by me) may have permissions set. If a document requires a password to open it I do not attempt to merge it.
I am using iText 5.5.1 (I think that is the latest) to create a PdfCopy object to contain the resulting document and a reader for each source PDF in a loop (I am passing a list of the documents to be merged). I check each document for the number of pages and then, using the PdfCopy object, import each page and add it to the PdfCopy object (these two steps are separate because of the intricacies of the language I am using to work with the Java objects, RPG on an IBM iSeries). The problem is that I can attach a reader to a PDF with permissions and get the page count, but as soon as I try to import a page into the copy object the program complains and terminates with the message 'PdfReader not opened with owner password'. I am not able to get the person(s) providing the documents from other organisations to not protect the documents (there are very, very good reasons why the original document is protected from change), but I need to consolidate these documents into one.
My question is, can I copy PDFs with permissions into a new document using iText, and can I do it without knowing the owner password? In addition to that, I guess the other question would be: is it legal?
Thanks
GarryM
Introduction: A PDF file can be encrypted using a public certificate. If you have such a PDF, you need the corresponding private key to decrypt it. A PDF file can also be encrypted using two passwords: a user password and an owner password. If the PDF is encrypted using a user password, you need at least one of the two passwords to decrypt it.
Assumption: I assume that the PDFs are encrypted with nothing but an owner password. You can open these documents in a PDF viewer without having to provide a user password, which means the content can be accessed, but there are some restrictions in place depending on the permissions that are set.
Situation: iText is a library that allows you to access PDFs at a very low level, without a GUI. It can easily access a PDF that is encrypted with nothing but an owner password, but it can't check if you respect the permissions that are defined for the PDF. To make sure that you are aware of your responsibilities, an exception is thrown saying PdfReader not opened with owner password. This is often too strict: sometimes you have the permission to assemble a PDF file, but with iText it's all or nothing. Either you can open the file, or you can't. iText doesn't check what you're doing afterwards.
Solution: There is a static Boolean parameter called unethicalreading that is set to false by default. You can change it like this:
PdfReader.unethicalreading = true;
--EDIT (since iText 7):
pdfReader.setUnethicalReading(true);
From now on, it will be as if the PDFs aren't encrypted.
Is this legal? It's not that clear and I am not a lawyer, but:
It used to be illegal when Adobe still owned the copyright on the PDF specification. Adobe granted the right to use that copyright to any developer on certain conditions. One of these conditions was that you didn't "crack" a PDF. Removing the password from a PDF broke your "contract" with Adobe to use the PDF specification and you risked being sued.
This changed when Adobe donated the PDF specification to the community in order to make it an ISO standard. Now everyone can use this international standard, and the above (risk of being sued by Adobe for infringing the copyright) no longer exists.
As the ISO standard documents the mechanism of encryption with an owner password and it is very easy to use the ISO standard to decrypt a document without having that password, the concept of introducing an owner password to enforce permissions is flawed from a technical point of view. It's merely a psychological way to discourage people from doing something with your document that you, as an author, do not want.
It's like a stop sign on a deserted road. It says: you should stop here, but nobody/nothing is going to stop you if no one is around.
Suggested approach:
My approach is to decrypt the PDF using the unethicalreading parameter, and to look at the permissions that are set. If the permissions don't allow assembly, I refuse the document. I also set permissions on the resulting PDF where I try to find the combination of permissions that respect the permissions set on the original documents.
In some cases, it's not that hard: the people who need to process the PDFs are often the owners of the documents, who forgot the passwords that were used to encrypt them. In that case, simple permission of the owners of the documents is sufficient to decrypt them.
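The permission handling in this approach boils down to bit operations on the permission integers of the source files. Here is a minimal sketch, assuming the permission values have already been read from each source (in iText 5 that would presumably be PdfReader.getPermissions(), with unencrypted sources treated as -1, i.e. everything allowed); the class below is illustrative and not part of iText.

```java
import java.util.List;

/** Combines the permissions of several source PDFs for a merged result (illustrative). */
final class MergePermissions {

    /** Bit 11 of /P (1-based): assembling the document. */
    static final int ASSEMBLE_DOCUMENT = 1 << 10;

    /** True if the permission value allows inserting, deleting and rotating pages. */
    static boolean mayAssemble(int permissions) {
        return (permissions & ASSEMBLE_DOCUMENT) != 0;
    }

    /**
     * Returns the permissions for the merged file: only what every source grants.
     * Refuses the whole merge if any source does not permit assembly.
     */
    static int combined(List<Integer> sourcePermissions) {
        int result = -1; // all bits set: nothing restricted yet
        for (int permissions : sourcePermissions) {
            if (!mayAssemble(permissions)) {
                throw new IllegalArgumentException("a source does not permit document assembly");
            }
            result &= permissions;
        }
        return result;
    }
}
```

The bitwise AND is one possible reading of "the combination of permissions that respect the permissions set on the original documents": an operation stays allowed in the result only if all sources allow it. The resulting value could then be passed on when encrypting the merged file.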
Final remark: I'm the original developer of iText and I'm responsible for introducing the unethicalreading parameter. I've chosen the name unethicalreading only to make sure people are aware of what they are doing. It doesn't mean that using that parameter is always unethical or illegal.
