Somehow send command line commands on windows externally and get back the response - pdf

Problem: Need to convert local html (with local images etc) to pdf from an AIX box running Universe 11.2.5 with System Builder
Current solution: FTP over html file to a Windows server which converts in batches and sends the e-mail to the destination
Proposed Solution: Do everything on the AIX box, from converting html to pdf and sending the e-mail.
Current problem: Unable to find a way to convert local html to PDF on the AIX box. I have tried many different approaches, including installing Python 3, but to no avail.

The only really difficult part of the process is getting the HTML to render into a format that will properly lay out your HTML as pages suitable for printing. There is a fair amount of magic that goes on between the HTTP GET and clicking print in a browser window that needs to be accounted for.
I tried to accomplish something similar many moons ago on AIX but kind of ran into a skill level/time wall, because I was essentially going to have to create a headless browser to render the HTML. It looks like there are now some utilities that you might be able to leverage. I found this recently updated article on Super User that actually got me somewhat excited, especially since I don't use AIX anymore, so precompiled binaries and well-understood, easily attainable dependencies are something I can actually have in my life.
https://superuser.com/questions/280552/how-can-i-render-a-website-as-an-image-from-the-shell
Good Luck.

There seem to be several questions rolled into this one item.
Converting HTML to PDF: while that is just data manipulation that you could do in BASIC, writing such code would be a large task. The option you use now, sending it to another system, is valid but puts more points of failure into the system. I would think you could find code to do it on the AIX box.
Rocket plans on getting MV Python to work on AIX; this will make converting HTML to PDF much easier, since there are a lot of open source modules.
As for my suggestion of using sockets, that would apply if you intend to send it to a service that will take the HTML and return the PDF document.
i.e. Is there a web service for converting HTML to PDF?
Once you have the PDF document, you can either store it in a UniVerse type-19 file, or base64-encode it and store it in a UniVerse hash file.
Hope this helps,
Mike
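The base64 option mentioned above can be sketched as follows. This is only a minimal illustration in Python, which the thread notes is not yet available for UniVerse on AIX, so it assumes the encoding step runs wherever Python is installed. The function names `pdf_to_base64` and `base64_to_pdf` are hypothetical, not part of any UniVerse API.

```python
import base64
from pathlib import Path


def pdf_to_base64(pdf_path):
    """Return the PDF's bytes as a single-line ASCII base64 string."""
    raw = Path(pdf_path).read_bytes()
    return base64.b64encode(raw).decode("ascii")


def base64_to_pdf(encoded, out_path):
    """Decode a base64 string back into the original PDF bytes."""
    Path(out_path).write_bytes(base64.b64decode(encoded))
```

The encoded string contains only ASCII letters, digits, `+`, `/` and `=`, so it cannot collide with the mark characters a hashed record uses as delimiters. The cost is that it is roughly a third larger than the binary file.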

Related

How can I extract the title of a pdf when downloading them?

I deal with PDFs a lot, but when I download them the file name usually doesn't contain the actual title of the PDF/paper, so I have to rename it most of the time, which I find annoying.
In many cases the URL doesn't contain the title of the PDF, so I guess it has to be extracted by processing the content of the PDF. And it needs to be done on the client side, e.g. as a browser plugin?
Is there a way that I can get the title when I'm downloading PDFs over the web via scripting or something?
That most likely won't work and here is why.
You would have to write some incredibly dynamic code to fetch some sort of title for the PDF. You would have to have code that would scan a website, somehow pick out a title, then fire off a request to code running on your computer to change the name.
It would be somewhat inconvenient because you would always have to have the script run on your computer (likely always having terminal open).
Your code would be highly prone to error. If your website script messed up, you could accidentally name the PDF incorrectly and then not be able to find it based on how inaccurate the name is.
For now, I would suggest dealing with the pain of editing the PDF name manually.

How to create advance PDF file encryption and protection using php?

I have a problem about PDF file encryption using php.
Case: Let's say I have a local (web-based) system to upload and download files, such as 4sh*red (dot) com, but it only allows PDF files. A user signs up and logs in to download the PDF files using his or her own personal computer. After a user has downloaded a PDF file from my system, the file can be viewed only on the computer where it was downloaded. If another user copies the downloaded PDF file to another computer, the file can't be viewed on that computer.
Note: I don't mean protecting the PDF files with a password, because nowadays there is a lot of software that removes PDF password protection. I mean that the file can't be viewed at all if copied to another computer.
Can we do that in PHP? If yes, do you know any algorithm to solve the case?
I really appreciate your response or answers.
Thank you.
The PDF format is an open format by Adobe. This means there are a lot of programs out there that can read it and quite a few that can modify it.
If you write your own program and add some stuff to the PDF, then maybe you can do this.
Another question is - why don't you just make the document visible in the web browser to the user? Of course there's still going to be a way around for savvy users to get it, but most noobs wouldn't know how and you can easily close the simplest blocks (like right click / save).
What may be interesting to do is what a lot of companies are doing with videos nowadays: you can dynamically add some hidden or visible 'info' to a PDF that identifies who you sent it to. That way, if the PDF shows up somewhere else, you know who spread it. Again, PDF is an open format, so anyone can always erase whatever you write in the main contents, so you'd have to add a hidden image to the content or something.

How to write a script that interacts with web browser and print content as PDF?

I'm looking to write an automated script that:
1. Opens up a browser instance with a specific URL
2. Prints the page as PDF output to a pre-defined location and document name
3. Simulates a click event on the web page that goes to the next report
4. Repeats 2 and 3 for a fixed number of times.
I'm not sure how to start doing this. I thought of using JavaScript, but it won't be able to automate the printing process.
There is no control of the server, therefore I cannot use a query to get the collection of those reports.
The reason for the script is that there are many such reports, and the server can be very slow at times, it would be better to have them locally.
UPDATE: Forgot to mention that log in is required for the server.
I think scripting an off-the-shelf browser is very much the Hard Way to solve your problem. If you can at all predict the URLs for the individual report, use a command-line tool such as wget or curl to download them, and then look at this community wiki for rendering the downloaded HTML as PDF.
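If the report URLs are predictable, the download step suggested above could look roughly like the sketch below. It is written in Python rather than wget or curl, and everything specific in it is an assumption: the URL template, the `{n}` placeholder, the output file naming and the function names are hypothetical. Because the question says a login is required, the caller would most likely need to pass an opener that carries session cookies (for example one built with `urllib.request.HTTPCookieProcessor`); how the login itself works depends on the server and is not shown.

```python
import urllib.request
from pathlib import Path


def report_urls(template, first, count):
    """Build the list of report URLs from a template containing {n}."""
    return [template.format(n=n) for n in range(first, first + count)]


def download_reports(urls, out_dir, opener=None):
    """Fetch each URL and save it as report_001.html, report_002.html, ..."""
    # Assumption: pass an opener with cookie handling if a login is required.
    opener = opener or urllib.request.build_opener()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(urls, start=1):
        target = out / "report_{:03d}.html".format(i)
        with opener.open(url) as resp:
            target.write_bytes(resp.read())
        saved.append(target)
    return saved
```

A call such as `download_reports(report_urls("https://example.com/reports/{n}", 1, 20), "reports")` would then leave twenty HTML files in a local `reports` directory, ready to be viewed directly or converted to PDF.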
Or do you even need to go to PDF? If all you're interested in is having the reports available locally, why not keep them as HTML and view them in a browser (with a file: URL) rather than a PDF viewer?

Creating PDF file in PowerBuilder

I am new to PowerBuilder. I got an assignment to create a PDF file using PowerBuilder. How can I do that?
Our organization used to use Ghostscript, but has instead moved to Amyuni.
As suggested by Alberto Megia, download PDFCreator, but don't use SaveAs.
When you install PDFCreator it installs a printer. Use that printer to save the datawindow with the Print function.
After you call the Print function, you will see a "Save as" dialog.
If you use the SaveAs function, the PDF will not have the format that the datawindow shows.
What version of PowerBuilder are you using? The most recent versions have PDF capability built in (using Ghostscript).
Install Ghostscript.
Get PDFCreator for free there and install it.
Then you can save as PDF any datawindow or datastore with the statement:
dw_1.saveAs(path_where_to_save_with_name_of_file.pdf, PDF!, true)
Third parameter is for override if the file exists with that name. I hope it works for you.
Regards,
Alberto
We just use Ghostscript. I wrote Ghostscript setup instructions earlier. We also print Word documents we've filled in to PDF from our app by printing them to 'Sybase DataWindow PS' printer then running Ghostscript to make the PDF.
Good question. There really isn't an easy way other than finding a third-party tool. I've tried the prior method mentioned and it does work, but not without headaches, and you are left with deployment problems: deploying Ghostscript and having to make sure PostScript drivers are on the client.
I ended up trying many PDF converters, both free and paid. The one that worked most seamlessly was one that installed as a "printer", as Adobe does if you have it installed on the PC. But you need to dynamically verify the existence of the printer via RegistryGet and, if it doesn't exist, ask the user to install it or install it dynamically via code and registry entries (not fun).
After several headaches, mostly related to deployment issues, I ended up going with a server solution, but it requires a server on which you can run a process (a distiller) that grabs PostScript files and distills them to PDF. I used a response window with a progress bar; the PB app printed a PostScript file to a server location, which the distiller then grabs and converts. My PB app polls the server until it finds the PDF or the user cancels, whichever comes first. With a good distiller the process is fast (< 5 seconds), which was acceptable to our users.
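The polling step described above can be sketched as follows. The original was a PowerBuilder response window; this Python version only illustrates the loop, and the function name, the timeout and interval defaults and the `cancelled` callback are all assumptions made for the example.

```python
import time
from pathlib import Path


def wait_for_pdf(pdf_path, timeout_s=30.0, interval_s=0.5, cancelled=lambda: False):
    """Poll until the PDF exists, the caller cancels, or the timeout expires.

    Returns True if a non-empty file was found, False otherwise.
    """
    path = Path(pdf_path)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if cancelled():
            return False
        if path.is_file() and path.stat().st_size > 0:
            return True
        time.sleep(interval_s)
    return False
```

One caveat with this simple check: a file can exist and be non-empty while the distiller is still writing it. Having the converter write under a temporary name and rename the file when finished would be a common way around that, if the distiller allows it.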
Once the PDF exists, we attach it to an email and send it via Oracle (mapi). This solution limits the requirements on the client to a PostScript driver, which is there in most corporate environments, but you need to check for it via the registry. Maybe there is a better solution out there now, since I last did this around 2008.
fyi- I usually don't make vendor recommendations but will in this case, because there was one that stood out in ease of use and quality. It was called PDFCreator, which installs as a Windows printer. It looks to be open source right now, but I recall that we would have had to pay to use it in a corporate environment.
Good Luck.
Use the tutorial How to use PowerBuilder to create PDF file?.

How to check if PDF was modified

I have a PDF generated by a 3rd party system. Using a PDF editor or other software, I have modified it.
Is it possible to detect if PDF file was modified, without original file?
I will add some more details.
There is no encryption and no signature features.
Document is created by IT system. User receives document and modifies it.
Is it possible to track that change somehow?
I thought that all these applications leave some data in the PDF header, or somewhere encoded inside the file, and that it is possible to check it. However, the properties shown by Windows Explorer show nothing... so I was interested in whether there is something smarter than viewing the properties/header in Explorer.
The problem with this is that just opening the PDF on a Mac in Preview and hitting Command-S to save the file will replace both the Creation and Modification date to match the current date/time. So even the creation date will be wrong. Even novice users can unknowingly do this, so if you're trying to track someone who may be purposefully modifying the document, it may lead to a false positive.
What you're asking is just too easy to spoof and fool unfortunately.
You could always check the md5sum of the pdf file. I'm not sure what environment you are using but that should help get you started.
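A checksum comparison like the one suggested above might look like the sketch below. It only helps if a digest was recorded when the document was generated, which is an assumption here; with nothing to compare against, the hash of the current file says nothing about modification. The function names are hypothetical. The default is SHA-256, and passing `"md5"` gives the same value `md5sum` would print.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_unmodified(path, recorded_digest, algorithm="sha256"):
    """Compare the file's current digest with one recorded at creation time."""
    return file_digest(path, algorithm) == recorded_digest.strip().lower()
```

Any change to the file, including a re-save that alters only metadata, produces a different digest, so a mismatch shows that the bytes changed but not what changed.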
It's going to be rough without the original file unless security features like encryption or digital signatures were applied to it, which it doesn't sound like they were. Do you have access to any information at all about the original file? A file size, creation date, any of the metadata, etc.?
If the tool used to modify the PDF is working according to the PDF spec then in the Info dictionary it should update ModDate but leave CreationDate alone. You may also see some non-zero generation numbers on the objects although it is just as possible that all the objects have been regenerated and will therefore be generation 0. The trial version of CosEdit will allow you to look at these 2 items.
If however the tool has been used to intentionally modify the PDF without leaving a trace then they would be spoofing those bits of data so they won't help you.
Are the users modifying the PDF using Acrobat? If so then what Danio mentioned above should work. Strictly speaking, modifying the PDF should change its ModDate or xmp:ModifyDate without changing its CreationDate. However not all tools adhere to this; quite a few simply leave all metadata untouched, so this method of checking isn't 100% reliable unless you know what PDF editor your users employ.
If the editor your users use does change ModDate or xmp:ModifyDate, then you should be able to see it in two places. One is when you open the document in Acrobat and hit Ctrl-D to view Document Properties. The Creation field and Modified field should have different timestamps. There may also be APIs that can be used to programmatically retrieve this metadata. The other way you can visualize it is to simply open the PDF in Notepad and search for the properties. Most of the document won't be human readable but these timestamps should be. If they do get changed appropriately, you can always parse for them in your application. Good luck!
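Parsing for the timestamps, as suggested above, can be sketched roughly like this. It is the programmatic equivalent of searching the file in Notepad, so it only works under the same assumption: the Info dictionary is stored uncompressed and unencrypted, with the dates as literal strings in parentheses. The function names are hypothetical, and the sketch does not look at `xmp:ModifyDate`.

```python
import re

# Matches entries such as: /ModDate (D:20190824160229+08'00')
_DATE_RE = re.compile(rb"/(CreationDate|ModDate)\s*\(([^)]*)\)")


def info_dates(pdf_bytes):
    """Return the CreationDate/ModDate strings found in raw PDF bytes.

    If a key appears more than once (e.g. after an incremental update),
    the last occurrence in the file is kept.
    """
    found = {}
    for key, value in _DATE_RE.findall(pdf_bytes):
        found[key.decode("ascii")] = value.decode("latin-1")
    return found


def looks_modified(pdf_bytes):
    """True if both dates are present and differ, None if either is missing."""
    dates = info_dates(pdf_bytes)
    created = dates.get("CreationDate")
    modified = dates.get("ModDate")
    if created is None or modified is None:
        return None
    return created != modified
```

A `False` result is weak evidence at best, for the reasons given in this thread: some editors leave the metadata untouched and others rewrite both dates.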
If you're using Ubuntu Linux 18.04 with Document Viewer, you can:
click on File options (3 vertical line ellipsis)
click on Properties...
look for Created / Modified fields in the Properties pop up
Beware: A sufficiently knowledgeable user can manipulate the PDF contents without changing the Created and Modified time stamps in the PDF metadata and the file system.
You can use a tool to read the PDF file's properties.
I use pdfinfo; it reports many properties of the file, which you can then check.
pdfinfo 58dcc41d01293.pdf
Author: worker
Creator: Microsoft® Word 2016
Producer: Microsoft® Word 2016
CreationDate: Sat Aug 24 16:02:29 2019
ModDate: Sat Aug 24 16:02:29 2019
Tagged: yes
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 55
Encrypted: no
Page size: 841.92 x 595.32 pts (A4)
Page rot: 0
File size: 3346838 bytes
Optimized: no
PDF version: 1.7
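Output like the above can also be checked from a script. The sketch below assumes `pdfinfo` is installed and on the PATH and that its output keeps the `Key: value` layout shown here; the function names are hypothetical.

```python
import subprocess


def run_pdfinfo(pdf_path):
    """Run pdfinfo on a file and return its text output."""
    result = subprocess.run(
        ["pdfinfo", pdf_path], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_pdfinfo(text):
    """Turn 'Key: value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def dates_differ(fields):
    """True if CreationDate and ModDate are both present and not equal."""
    created = fields.get("CreationDate")
    modified = fields.get("ModDate")
    if created is None or modified is None:
        return None
    return created != modified
```

For the sample output above the two dates are identical, so `dates_differ` would return `False`. That is consistent with an untouched file, but also with an editor that did not update ModDate.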