Converting link-like texts to links - pdf

I just noticed some very strange behaviour in my pdf.
I have a link in there (just pure text!) "www.example.com".
Now when I view this in my reader (Sumatra & Adobe) this link has a "hand" icon and I'm able to click on it. Even though I just put it as real text in there.
So I converted it to PS (pdftops), found the text (\000w\000w\000w etc.), and just changed the first "w" to "a".
Then I converted it back into a PDF file via ps2pdf and, to my surprise, the link was gone (it now shows aww.example.com).
So, back in the PS file, I changed the "a" back to "w" and re-converted.
Suddenly the link was there again.
Because of this I'm pretty certain this is a feature of the reader, which is being too smart for its own good.
Which brings me to my question: how can I turn this off? Is there any comment/setting I can give which would disable this behaviour for that specific PDF file?
[The problem is that the reader just adds "http://" in front of the text to create the link, and when I click it, my reader warns me that the link is not safe. Talk about adding insult to injury!]
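To make the suspected behaviour concrete, here is a minimal sketch of the kind of heuristic a reader might apply. This is an assumption for illustration only: the pattern `BARE_URL` and the function `guess_links` are hypothetical, and the actual rules used by Sumatra or Adobe Reader may well differ.

```python
import re

# Hypothetical pattern: text starting with "www." followed by at least one
# more dot-separated part. Real viewers use their own rules.
BARE_URL = re.compile(r"\bwww\.[^\s/]+\.[^\s]+", re.IGNORECASE)


def guess_links(text: str) -> list[str]:
    """Return the URLs a heuristic viewer might attach to plain page text."""
    # The viewer has no scheme to go on, so it prepends "http://".
    return ["http://" + match.group(0) for match in BARE_URL.finditer(text)]


print(guess_links("Visit www.example.com today"))
print(guess_links("Visit aww.example.com today"))
```

Under this sketch, the text starting with "www." yields a link while "aww.example.com" yields none, which matches what I saw after editing the PS file.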
Thanks!

Related

Download pdf - accessibility for screen readers

I'm curious how to make an accessible button for screen readers which downloads a PDF.
I know that there is the option of using href with the URL of the PDF file, and even a download attribute on the anchor to open a download window.
But that doesn't seem like a good way for a screen reader. The screen reader announces it as a link, but it is not really a link, because it triggers the download of a PDF file rather than navigating to another page. This can be confusing for people with vision impairments who rely on their screen readers.
Is it good accessibility practice to create such a button? Or is relying on <a href='path-to-pdf'>...</a> enough and not confusing for people with disabilities?
General answer and basics of file download
Both a link and a button are perfectly fine, it doesn't make much difference.
In any case, it's very important to explicitly indicate that the link or button is going to download a file rather than open a page, to avoid surprises.
The simplest and most reliable is just to write it textually, i.e. "View the report (PDF)".
You may also put a PDF icon next to the link to indicate it, but make sure to use a real image, i.e. <img alt="PDF" />, and not CSS stuff, since the latter may not be rendered to screen readers and/or doesn't give you the opportunity to set alt text (which is very important).
It is also good practice to indicate the file size if the file is big (more than a few megabytes), so that users with a slow or limited connection won't get stuck or burn through their mobile data needlessly.
It's also good to indicate the number of pages if there are more than just a few, so that people get an idea of how big the document is and whether they can really take the time required to read it.
Example: "View the report (PDF, 44 pages, 17 MB)"
Note that, similarly, it's good practice to indicate the duration of a podcast or video beforehand.
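If the links are generated by code, the label can be built from the file's metadata. The following is only a sketch: the function name `download_label` and its parameters are made up for illustration, and it assumes you already know the page count and the size in bytes.

```python
def download_label(title: str, file_type: str, pages: int, size_bytes: int) -> str:
    """Build link text that states the file type, page count and size."""
    # Round to whole megabytes (1 MB = 1,000,000 bytes); good enough for a hint.
    size_mb = round(size_bytes / 1_000_000)
    return f"{title} ({file_type}, {pages} pages, {size_mb} MB)"


print(download_label("View the report", "PDF", 44, 17_000_000))
# prints: View the report (PDF, 44 pages, 17 MB)
```

Putting this information in the visible text means sighted users and screen reader users get the same hint.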
Additional considerations with PDF
First of all, you should make sure that your PDF is really accessible. Most aren't by default, sadly.
You should easily find resources on how to proceed to make a PDF accessible if you don't know.
Secondly, for an accessible PDF file to actually be read accessibly, it has to be opened in a real PDF reading program which supports tagged PDFs, like Adobe Reader.
The problem is that nowadays most browsers have an integrated PDF viewer. These viewers usually don't support tagged PDFs, so even if you make an accessible PDF, it won't be accessible to the user if it is opened in the browser's integrated viewer.
So you must make sure that your link or button triggers an actual download, or opens the file in a true PDF reading program, rather than opening it in the browser's integrated viewer.
There are several ways to bypass the integrated viewer. They may or may not work depending on the OS and browser, so they have to be tested:
Send a header Content-Disposition: attachment; filename="something.pdf"
Send a Content-Type other than "application/pdf" or "text/pdf", e.g. "application/octet-stream", to defeat basic type detection
Make sure the link doesn't end with .pdf
Use the download attribute of <a>
The most reliable are the response headers, since most browsers don't rely on the file extension alone to decide what to do.
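As a rough illustration of the response-header approach, here is a minimal sketch using Python's standard http.server module. The file name `report.pdf`, the helper `download_headers` and the handler class are assumptions made up for this example; in a real application you would set the same headers through whatever server or framework you use.

```python
from http.server import BaseHTTPRequestHandler


def download_headers(filename: str, length: int) -> dict[str, str]:
    """Headers asking the browser to save the file instead of displaying it."""
    return {
        # "application/octet-stream" is the alternative mentioned above.
        "Content-Type": "application/pdf",
        "Content-Disposition": f'attachment; filename="{filename}"',
        "Content-Length": str(length),
    }


class PdfDownloadHandler(BaseHTTPRequestHandler):
    PDF_PATH = "report.pdf"  # hypothetical file on the server

    def do_GET(self):
        with open(self.PDF_PATH, "rb") as pdf_file:
            body = pdf_file.read()
        self.send_response(200)
        for name, value in download_headers("report.pdf", len(body)).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, you could pass the handler to `http.server.HTTPServer(("127.0.0.1", 8000), PdfDownloadHandler)` and call `serve_forever()`. As with the other options, test it in the browsers you care about.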
Either a link or a button is fine. The most important thing is that the user is informed about what the element does - i.e. it downloads/opens a PDF file. So, this should be reflected in the element's label, whether that is a visible text label or an icon that uses alt text or aria-label to explicitly describe the element's purpose.
I agree with QuentinC's suggestion to also inform the user upfront about the number of pages and size of the document - that's a nice touch that I don't see very often!
PDF accessibility is a whole other topic, but again as QuentinC points out, there's not much good in allowing a user to download or view a PDF that isn't accessible, so it's a good idea to ensure the PDF has been tested against JAWS/NVDA/VoiceOver/TalkBack to ensure it is readable.

Blue Prism - Save and Read Pdf

I am trying to save a PDF which opens via a web link and, after saving it, I want to read all the text present in the PDF file.
I have tried to save it by sending keys (CTRL+SHIFT+S) with "send keys" in BP, but was not able to save it.
Also, to read the data present in the PDF (or any other PDF), I tried sending the key strokes CTRL+A and CTRL+C, but was not successful.
Theoretically (if you haven't done this already), you could create an object and model that attaches to the open (running) .pdf instance. Then, with the spied element of the save button/option in your .pdf, proceed from there with further elements/clicks to save it wherever you want. This should take a few clicks using Navigate stages. The same principle applies if you are using sendkeys: you still need to use the root element of the model that attaches to or launches the .pdf. If you haven't done this, the sendkeys are just never going to work.
As for capturing the contents, I am not aware of any downloadable VBOs that will do this. I know there are some for MS Word that capture tables and the like into a Collection stage, but not for .pdf. You can try the sendkeys again once you are sure you are using the root element of the correct model, or you might have a go at creating your own solution using a code stage.

PDF downloading instead of opening in new tab

This is not a back-end programming question. I can only modify the markup or script (or the document itself). The reason I'm asking here is that all my searches for appropriate terms inevitably lead to questions and solutions about programming this functionality. I'm not trying to force it via programming; I have to find out why this PDF is behaving differently.
So:
I have a bunch of links to PDFs on a page. Most of them open in new tabs, but one of them, the most recent, starts to open in a tab, but then the tab closes and the PDF gets downloaded as a file instead. All markup is consistent - there's nothing different about the odd one out except the actual URL.
You can see this here:
http://calwater.mwnewsroom.com/Investor-Relations/Financial-Reports/Annual-Reports
All annual reports up to 2012 open in a new tab, but 2013 downloads instead.
This leads me to believe that there is some meta-data property of the PDF itself that tells it how to open, and that, in this case, the 2013 PDF was created using different settings.
Apparently, the PDF was saved out to PDF from InDesign.
Does anyone have any insight?
Problem solved. There was simply an error in the string that references the attachment (like an extra period), so the browser couldn't tell that it was a PDF. Fixing the reference fixed the problem.
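For anyone hitting the same thing, a quick way to spot such an odd reference is to check every link on the page. This is only a sketch, assuming the references are URLs whose path should end in ".pdf"; the function names and the example.com URLs are made up for illustration.

```python
from urllib.parse import urlparse


def looks_like_pdf_link(url: str) -> bool:
    """True if the URL's path (ignoring any query string) ends in .pdf."""
    return urlparse(url).path.lower().endswith(".pdf")


def odd_links(urls: list[str]) -> list[str]:
    """Return the links that would not be recognised as PDFs by their path."""
    return [url for url in urls if not looks_like_pdf_link(url)]


print(odd_links([
    "http://example.com/reports/2012.pdf",
    "http://example.com/reports/2013.pdf.",  # stray trailing period
]))
```

Only the second URL is reported, because its path ends in a period instead of ".pdf".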

Print preview of my web page to pdf and save it on server side programmatically

I have a deceptively difficult problem: I thought it was an easy one at the beginning, and yet I have spent more than 3 days on it, on and off, in total.
What I simply need to do is save the print preview of the page to a PDF file on the server side, from the code-behind, initiated by a button click.
I was expecting to use an open-source library, and I thought there would be a call like xyzopencode.savepasgeaspdf(path), but I could not find one. I got really close to a solution by saving the PDF, but then I realized it did not save the pictures; it only saved the strings.
I tried PDFsharp, but as far as I can see it draws the whole thing from scratch, and I am not sure I can do that.
The reason I need a picture-compatible solution is that I have a 3rd-party signature control on my page. A couple of my attempts worked without it or any picture, but when I added pictures they either failed to show the pictures or did not create the PDF at all. The perfect solution would be to just save whatever shows up in the print preview as a PDF, just like the built-in feature of Google Chrome (but on the server side).
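To illustrate what I mean by "like Chrome, but on the server side": Chrome itself can be run headless with its --print-to-pdf switch. The sketch below only shows that idea in Python rather than in my code-behind; the binary name `google-chrome`, the function names and the example URL are assumptions, and it renders a fresh load of the URL, so anything entered in the browser session (such as a drawn signature) would not appear unless the page can reproduce it from the URL.

```python
import subprocess


def print_to_pdf_command(url: str, output_path: str,
                         chrome_binary: str = "google-chrome") -> list[str]:
    """Build the headless Chrome command line that prints a page to PDF."""
    return [
        chrome_binary,  # assumed name; depends on how Chrome is installed
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={output_path}",
        url,
    ]


def save_page_as_pdf(url: str, output_path: str,
                     chrome_binary: str = "google-chrome") -> None:
    """Run headless Chrome; raises CalledProcessError if Chrome fails."""
    command = print_to_pdf_command(url, output_path, chrome_binary)
    subprocess.run(command, check=True, timeout=60)


print(print_to_pdf_command("http://example.com/page", "/tmp/page.pdf"))
```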

Hiding the "You cannot save data typed into this form" message in Acrobat

I am embedding a PDF form in my web application. The application allows you to fill in the fields in the form and, when you are done, click on a "Submit" button, which saves whatever you've entered into the form. This functionality is working fine.
Unfortunately, Adobe Reader displays a message on top of its embedded control that says: "Please fill out the following form. You cannot save data typed into this form. Please print your completed form if you would like a copy for your records."
Now, I know what Adobe Reader is trying to tell the user. Basically, Adobe Reader will not allow you to save what you've entered to your local hard drive as a new PDF.
However, we've added a Submit button which effectively saves what users typed within our application, and it is working. Therefore, we think this message is misleading and would like to remove it.
I use iTextSharp in .NET for our server-side form automation. I have not found a way to remove this message from the embedded forms.
Any help?
It has been a long time, but Adobe has added an option to hide this annoying message.
On OSX 11.0.3, Preferences>Forms>Always hide document message bar
I'm pretty sure that there is no way around this if you want to continue to use Acrobat Reader to display the PDF. This message is built into Acrobat Reader, and I am not aware of any way to override it from the outside.
Sorry, this is more in the way of a negative answer than a positive one.
There are some third-party, free, projects that are basically PDF viewers for .NET. This would allow you to get rid of the message by avoiding Acrobat Reader entirely, although this is a large amount of work just to get rid of a message.
This one is pretty comprehensive.
Another option that I'm sure you already thought of is to just build the form on the web page, instead of using the PDF. Again, a lot of extra work just to remove a message.
Adobe Acrobat (Standard and Pro) can change PDF forms to enable Adobe Acrobat Reader users to 'fill+save' form data (instead of the standard 'fill+print').
It is a special option available when saving the PDF saying "Save PDF with extended Reader functions" (or similar... I'm translating this back from German into English).
This cannot be achieved with any non-Adobe PDF creating software (unless its vendor has licensed that function from Adobe). The technical reason for this is that Adobe uses a digital signature to protect this function, and that you have to agree not to reverse engineer the key when you accept the Adobe EULA. Acrobat Reader has that key compiled into its binary, and if it verifies the key, it will change the message displayed to the user to indicate that the form data of this document can be saved (it will also change its behaviour and indeed save the data).
Maybe this info helps you?
Switch to View > Full Screen Mode (the shortcut on a Mac is ⌘L).
Although this mode hides all menus and scroll bars too, I prefer it. (IMHO the reader uses far too much screen real estate on junk.)