DocuSign PDF validation issue: is there documentation available? - pdf

We have some code that generates a data-filled PDF (FDF) file from an Excel spreadsheet and then sends it to DocuSign in our test environment.
Some of these work, and some come back with the error "PDF_VALIDATION_FAILED".
We have narrowed it down to the PDF document itself, and have stripped the original template down to just four fields. We have also stripped our Excel spreadsheet down to four basic fields, using (for example) "a,1,a,2" for one input and "aa,1,a,2" for another; one input consistently works and the other consistently fails.
Viewed in a local PDF viewer (Adobe and PDF-XChange Editor), the generated PDFs appear fine. Comparing the documents side by side in a hex/diff editor (WinMerge) shows minor differences in the streams being sent (as expected).
Is there any documentation on what validation is performed on the PDF, so we can emulate it locally and make sure our PDFs are valid before sending them to the DocuSign API?
Thanks
Template

I am able to successfully create an envelope with the Documents you have provided.
See here for the complete CreateEnvelope request that I have used
I have used these documents that you have provided
Working PDF
Non Working PDF

Related

How do I reference a hosted docx rather than sending it every time when creating a pdf (document generation api)

I have asked this question on https://community.adobe.com/ and have not received an answer. If I do, I will include the response here.
I am able to create a pdf using the basic approach outlined by adobe at https://developer.adobe.com/document-services/docs/overview/document-generation-api/
Ideally, I don't want to have to send the base docx word document across in the api call each time I generate a new pdf. I would rather host the docx which can be retrieved at document generation time. One approach would be a reference url to a docx hosted on acrobat.adobe.com. At the moment I have to send the docx as well as the json data which seems inefficient.
I am using https://cpf-ue1.adobe.io/ops/:create
"cpf:inputs":{
"documentIn":{
"dc:format":"application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"cpf:location":"InputFile0"
},
I guess if it can't be done, then that's okay, I just would like to know so I can implement accordingly.
thank you

How to make a generated pdf non editable in Objective C?

I have a banking client for whom I have designed an iOS app that populates the client details into the account opening application PDF forms and generates the final PDF. I am generating the PDF using CoreGraphics, but the PDF is editable in Adobe Acrobat Pro, so they are able to edit the contents of the application form. Is there any way to restrict editing of the PDF after it is generated from CoreGraphics? I have encrypted the PDF with a password, but the client needs the PDF to be non-editable.
See Protecting PDF Content
Reading between the lines a bit, because the docs are not overly clear, I think that when you create the PDF context using CGPDFContextCreate(), you pass a dictionary as its auxiliaryInfo argument, using the key kCGPDFContextOwnerPassword and a value that's some arbitrary password string. This encrypts the document so that only the owner (the people with that password) can work with the contents. The docs don't explicitly say this prevents editing, but I'm guessing that's implied, because they list special keys to block printing and copying (preventing editing seems like the thing one would always want when encrypting a PDF).

Using fingerprint generated by pdfjs as unique ID for a pdf

I need to create a database of different PDF files which are either uploaded by users to the server or saved as bookmarks to PDF files available on the internet. The files available on the internet are opened in pdf.js. I came across the fingerprint that pdf.js generates for some of its operations and was wondering if I could use that to identify a PDF uniquely. But to do that I also need to generate this fingerprint myself for the documents that are uploaded but not opened via viewer.js (since I can get my hands on the fingerprint via viewer.js but not otherwise). I can use iTextSharp for PDF parsing but have no clue how pdf.js generates the fingerprint.
It seems pdf.js is doing the following in its fingerprint():
If available, it uses the first ID string from the PDF trailer.
If there's no ID, an MD5 hash of (part of) the byte content is calculated.
That's my quick interpretation of the current pdf.js fingerprint() source
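For documents that never pass through the viewer, the two steps above can be approximated on the server. The following is only a rough Python sketch of that interpretation, not pdf.js's actual code: the function name is made up, the 1024-byte window for the MD5 fallback is an assumption (the answer only says "part of" the content), and the regular expression handles only hex-string IDs, not literal-string IDs. Compare its output against the viewer's fingerprint on a few real files before relying on it.

```python
import hashlib
import re

# Assumption: how many leading bytes are hashed when the file has no /ID.
FIRST_BYTES = 1024

# Matches the first element of an /ID array written as a hex string,
# e.g. "/ID [<ABCDEF01> <12345678>]". Literal-string IDs are not handled.
_ID_RE = re.compile(rb"/ID\s*\[\s*<([0-9A-Fa-f\s]*)>")


def approximate_fingerprint(pdf_bytes: bytes) -> str:
    """Return a hex fingerprint: the first /ID string if present, else an MD5."""
    matches = _ID_RE.findall(pdf_bytes)
    if matches:
        # Use the last match so the most recent trailer wins after
        # incremental updates; strip whitespace allowed inside hex strings.
        hex_id = re.sub(rb"\s+", b"", matches[-1]).decode("ascii").lower()
        # Ignore empty or all-zero IDs and fall through to the hash.
        if hex_id and set(hex_id) != {"0"}:
            return hex_id
    return hashlib.md5(pdf_bytes[:FIRST_BYTES]).hexdigest()
```

Usage would look like `approximate_fingerprint(open("upload.pdf", "rb").read())`, where `upload.pdf` is a placeholder path.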

How to make an incremental update to a PDF

I need to make an incremental update (adding some existing PDF pages) to a signed PDF, while keeping the included signature (which covers the first page) valid.
I've seen some posts saying this is possible with PDFStamper (iTextSharp), but I'm unable to find an example showing how to make it append.
Changing an already signed PDF would imply a security leak in the PDF signing functionality/spec. The purpose of signing a PDF is to guarantee to the reader that it has not been altered by anyone other than the original author.
I think your only option is to send the extra pages in a separate PDF, or to change the original PDF and have it re-signed.

How to get a pdf to display in a web browser before it's fully downloaded

I have a client that's been struggling with slow-loading PDF files on the web.
My client has some very large PDF files that are almost 10 MB. They take upwards of 3-4 minutes to download. The files will not display until the whole file is loaded.
We and they have seen other sites where PDFs load one page at a time, so the end user can start looking at the file while the rest of it is still loading in the background. This gives the illusion that the file has loaded faster.
According to the documentation they have seen, IIS 6 should do this automatically if the PDF file is created with “Optimized for fast web view” checked. It is checked, and the file still does not load a page at a time.
They have searched and found nothing other than that IIS will do this automatically if the file is saved correctly.
How can they "stream" the PDF? Is it because the PDFs on those sites were saved in a special way? Is there a JavaScript that handles the download? Or is there a change that needs to happen in IIS?
Thanks
Update:
The file starts out like this:
%PDF-1.4
%âãÏÓ
171 0 obj << 0/Linearized 1
Linearized?
The PDF document isn't being served up from an aspx/asp page. (It's just posted directly to the site and linked to).
You need to linearize the PDF and not trust IIS to do this for you.
There are a number of apps that will do this for you. I have used CVision (their compression is second to none, but the licensing and SDK are a pain). There are also some cheaper alternatives here, but I don't know how well they work.
To clarify Tony's point... (I think)
If you have actually used these tools and your PDF is linearized, try converting the PDF to a byte array and using Response.Write() to send the byte array (with content headers, etc.) to the client (in a new browser window or frame).
Would it be possible to use a third party service, like Scribd? If you go this route you can embed their streaming viewer onto your client's website. Just a thought, although I know it's not really suitable for every type of business.
This might happen if you are serving the PDF from an aspx page. To get the byte serving that linearized PDFs need, either the file needs to be served directly or you need to implement byte serving in the aspx code.
Save one of the files and open it up in a text editor. If you don't see something like
<< /Linearized 1.0 /L <number> /H [<number> <number>] /O <number> /E <number> ...
in the first couple hundred bytes or so, then you're not getting a linearized (i.e., fast web view) PDF.
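The same check can be scripted instead of done by eye in a text editor. This is a minimal Python sketch under a couple of assumptions: the function name is hypothetical, and the 1024-byte search window is my choice (the answer only says "the first couple hundred bytes or so"). It only detects that a linearization dictionary is present; it does not verify that the rest of the file is still laid out correctly.

```python
import re

# A linearization dictionary carries a "/Linearized <version>" entry.
_LINEARIZED_RE = re.compile(rb"/Linearized\s+[0-9.]+")


def looks_linearized(pdf_bytes: bytes, window: int = 1024) -> bool:
    """Return True if a /Linearized entry appears near the start of the file.

    `window` (assumed default: 1024 bytes) limits how far into the file we look.
    """
    head = pdf_bytes[:window]
    return head.startswith(b"%PDF-") and _LINEARIZED_RE.search(head) is not None
```

For example, `looks_linearized(open("report.pdf", "rb").read(2048))` reads just the start of a (placeholder) file and reports whether the marker is there.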
First, the document needs to be "linearized", as others have explained; you can linearize it in Acrobat or using pdfopt from Ghostscript. Second, the web server must be able to serve byte ranges (i.e., support the Range header); I have no idea how to configure IIS for this, but even if the document is linearized, the client has to be able to read particular byte ranges.
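To see whether a given server honors byte ranges, one option is to request a single byte and look at the response. The sketch below is in Python and uses only the standard library; the function names are made up for illustration, and the URL in the usage note is a placeholder. It treats a `206 Partial Content` status together with a `Content-Range: bytes ...` header as evidence of support, which is a reasonable heuristic rather than a complete test.

```python
import urllib.request
from typing import Optional


def interpret_range_probe(status: int, content_range: Optional[str]) -> bool:
    """Decide whether a reply to 'Range: bytes=0-0' shows range support.

    A server that honors the request answers 206 with a Content-Range header;
    a server that ignores it typically answers 200 with the whole file.
    """
    return (
        status == 206
        and bool(content_range)
        and content_range.strip().lower().startswith("bytes ")
    )


def probe_range_support(url: str, timeout: float = 10.0) -> bool:
    """Ask the server for the first byte only and interpret the reply."""
    request = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return interpret_range_probe(
            response.status, response.headers.get("Content-Range")
        )
```

Calling `probe_range_support("https://example.com/files/large.pdf")` against the real file URL would then indicate whether the server side of the requirement is met; the PDF still has to be linearized for page-at-a-time display.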