Add watermark to various documents investigation - pdf

I've been asked to investigate the feasibility of adding watermarks to documents when printed through our application. The documents will consist of word, pdf and cad.
The interface of the application is vb6 with a plethora of vc6 dll's.
I can see a couple of possible solutions:
Convert all documents to PDF, add a watermark and then print.
Find a print driver that will add a watermark to all documents prior to printing and install it and reenable it at runtime if it gets disabled for any reason.
3rd Party suites are possibility (we use Volo View Express for viewing CAD files) but since this application is nearing end-of-life we wouldn't want to spend too much on it.
Has anyone had any experience of the above? Any gotcha's that will bog me down?

Tracker Software has a good set of PDF api's that that will allow you to implement the solution you already have in mind. I've used their Image and PDF libraries quite a bit with a lot of success in both VB6 and .NET. Single user licenses are not expensive (depending on how you look at it I guess), and I've found support to be excellent as well.

Related

SOLVED: Looking for a way to automate generation of internal PDF hyperlinks

I have a 300+ page PDF document which needs to have internal page links added to it to reference other pages in the document. The document is created in Visio, which does not support consistent hyperlink generation in PDF export, so the link generation needs to be done on the PDF itself, not up the chain. This is an annual need, and regularly takes over a week due to the amount of manual labor, time, and checking needed.
The text which is hyperlinked has the same format in every case (e.g., "See Section 8.18 - How to Hyperlink"), and I'm certain this can be automated, as there are commercial plugins which can do this, but they cost hundreds of dollars, and are not able to be used in this case due to restrictions imposed by my employer. Example: https://www.evermap.com/ABAddingHyperlinks.asp
I've been looking through the Acrobat Plugin SDK and it seems doable, but I know there is also a higher level scripting language available for Acrobat. Does anyone have experience working with PDFs or with the Acrobat scripting / SDK tools? Are there open source methods for doing this? I've looked everywhere! Willing to learn. I've looked at Ghostscript (Adding internal hyperlink to a pdf) but what I need is way more than just a Table of Contents, and links can appear in many places on the page with line breaks, so consistency is a challenge.
EDIT: I found a solution! Bluebeam software's Revu Extreme works pretty darn well, and can be used as a 30 day free trial of all features. Only limitation is that links which extend across a line break (multiple lines of text) do not properly work in Edge or Chrome's PDF viewer, as they don't properly support hyperlinks with multiple click regions. I've submitted a ticket requesting a feature be added to Revu that fixes this, but for now those links need to be manually fixed following the batch link. The process is described here: https://support.bluebeam.com/online-help/revu2018/Content/RevuHelp/Menus/Batch/Link/Batch-Link--T.htm
EDIT: I found a solution! Bluebeam software's Revu Extreme works pretty darn well, and can be used as a 30 day free trial of all features. Only limitation is that links which extend across a line break (multiple lines of text) do not properly work in Edge or Chrome's PDF viewer, as they don't properly support hyperlinks with multiple click regions. I've submitted a ticket requesting a feature be added to Revu that fixes this, but for now those links need to be manually fixed following the batch link. The process is described here: https://support.bluebeam.com/online-help/revu2018/Content/RevuHelp/Menus/Batch/Link/Batch-Link--T.htm
You can add hyperlinks to a document with Ghostscript, but you would need to know the location of the text to hyperlink and the destination in advance, you cannot automate it or in fact write any reasonably simple code to automate the task using Ghostscript. You'd need to modify chunks of the PDF interpreter, which is written in PostScript and is not a task for anyone not a PostScript expert.
You could probably do it with MuPDF, and probably using MuJS to script it, but I don't know enough to be certain. It would still require some coding effort, but it would probably be easier to use JavaScript at least.

Extract Font from PDF using GdPicture

According to their website (http://www.gdpicture.com/products/managed-pdf/) you have the ability to extract fonts from a PDF file. However, I can't seem to find the functionality to do this. I have encountered several methods to add them, but none to extract them (and they don't show as embedded files). Has anyone tried to do this, or have experience with GdPicture?
Version: 14 (Current)
Disclosure: I am part of the ORPALIS technical staff that edits the GdPicture.NET SDK, that's why I know there's an ongoing communication about this already.
It is my understanding that you have a support case open for a merging issue relative to fonts and as you know, our development team is currently working on a fix that will solve it so I strongly recommend that you wait for them to finish.
There's no extraction of the embedded font as you might expect at the moment but the development team is also working on one, we will let you know as soon as it is available (it should be very soon).
You can get information about (already) embedded fonts using the GetFontCount, IsFontEmbedded, GetFontName and GetFontType methods.
You can also add new embedded fonts (of different types) using the AddFontFromFileU, AddStandardFont, AddTrueTypeFont, AddTrueTypeFontFromFile, AddTrueTypeFontFromFileU and AddTrueTypeFontU methods.

Multimedia PDF viewer for iOS

I want to implement an iOS application which views PDF files. I have used vfr/Reader in some other applications before. But now i need to display multimedia supported PDF files on iOS which include videos, animations etc. My customer create these PDFs by using InDesign.
I made a research and can not found a proper iOS based framework to achieve this. There are really limited number of solutions like Adobe, FastPDFKit etc but they are so expensive and there is no "one time fee" option.
Do you have any open source suggestions or the ones with lower prices?
EDIT: Made a research for days and days but there is no solution. Is there any other tool to create interactive ebooks or magazines? May be HTML5 or something with editor itself???
I have used QLPreviewController before to open local PDF files and it works great and it not hard to implement-- and it supports multiple documents.
Heres a good tutorial that I followed:
http://mobiledevelopertips.com/data-file-management/preview-documents-with-qlpreviewcontroller.html
I have not tested viewing videos or anything-- the PDFs I had were simple. But it is an Apple control and has all of the basic PDF viewing functions like pinch to zoom, but like I said my PDFs only had text but it is worth a shot to try.
After a long long research and discussions, unfortunately there is no free solution with these features. There are only paid solutions and Adobe Publishing Suite takes the head although it is one of the most expensive solutions

A technology for reading pdfs online with annotations?

is there an open source solution that displays PDFs for online reading? It has to be searchable much like google books and if possible has the ability to display annotations?
By "online reading" I'll assume you mean without a PDF reader plugin on the client. In that case you'll need to convert to HTML
http://pdftohtml.sourceforge.net/
If you don't mind losing the ability to copy text then converting to PNG may give you a more accurate rendering
http://www.imagemagick.org/
Regardless of the output format you can manage your searching using the original PDF data. One technology for this is mnogosearch
http://www.mnogosearch.org/
Monogosearch uses pdftotext internally, you may find this useful if you want to write your own search routines. pdftotext is part of the Xpdf suite of utilities
http://www.foolabs.com/xpdf/about.html
All of the tools listed above are available on Windows or Linux
You may also be interested in the Vuzit DocuPub Platform: http://vuzit.com/products/docupub_platform
The display technology itself is not open source, but they provide an API to access their service, so perhaps it is worth investigating.
Don't know if you are looking a software to install or some service to pay for...
I've read a lot about www.getbackboard.com (this is not advertising, only reporting something I've read about, that maybe fits your needs.. ;)
Not sure if they do annotations, but both of these will show PDFs quite well:
http://pdfmenot.com
http://docs.google.com
ICEPdf recently released their code as open source. It is Java based.
PyPdf is really nice. It supports reading the text as well as encryption which I know that itextsharp does not.
Of course you'd have to program in python as IronPython's class libraries aren't quite to the point where you can ref them from another language and use them. (But I imagine they will be someday soon)
PyPdf
This is not open source, but check it out anyways. You can download a free trial of their SDK to try it out. Reading PDF's and their annotations is not simple and I wouldn't trust a production app to open source decoders.
Here is an online demo.
http://www.atalasoft.com/ajaxannotations/default.aspx
Another good pdf reader is FoxitReader.

Software/Platform to Share Specs

What are the software/ Wiki you use to write and share your specs about the developers, testers and management?
Do you use Wiki system, and if so, what Wiki software you use?
Or do you use Sharepoint to manage and version the specs? One problem with SharePoint 2003 as specs platform is that it's very hard to collaborate among different people.
For backward compatibility sake, I would also like to have the platform able to import Microsoft Word seamlessly. And it would certainly help if the interface is similar to Microsoft Word.
Any idea?
I've used Confluence at a number of places, it's a pretty powerful wiki and very good for creating specifications that can be shared amongst various parties. See:
http://www.atlassian.com/software/confluence/
There's some more information here on the advantages of using Confluence:
https://stackoverflow.com/questions/170352/confluence-experiences
EDIT: I've updated this to deal with the Microsoft Word import feature you mentioned. Confluence supports this through the Office Connector here:
http://www.atlassian.com/software/confluence/plugins/office-connector.jsp
There's also a Sharepoint connector:
http://www.atlassian.com/software/confluence/plugins/sharepoint-connector.jsp
plus a whole bunch of plugins:
http://www.atlassian.com/software/confluence/plugins/sharepoint-connector.jsp
Some of these are user contributed also. I can't recommend Confluence enough as a commercial wiki.
I've also used JSPWiki, which is open source. it's ok but not as good as confluence, see:
http://www.jspwiki.org/
You could try Google docs - I have successfully used this in the past. It supports import / export to MS Word, and it has great support for multiple user - see http://www.brighthub.com/internet/google/articles/8236.aspx.
It supports versioning, allows you to chat with other people who are currently working on the document, and shows you a list of all the changes others have made to the document (without needing to close / reopen the document).
If you want corporate support, Google also provides that - see Google Apps for business.
We use SharePoint -- it's not ideal, but it does a decent job. If I were you, I would seriously look at getting off SharePoint 2003 and on to MOSS (SharePoint 2007). It's not perfect, but it's substantially better. Here's a little bit on using MOSS as a wiki. I think in general wiki's are a good tool for getting people up to speed on your system. We used to pass around "getting started documents" and now we have all that type of stuff in our developer portal.
Per John's comment, I looked up this feature comparison. I have to go back and look at what features I'm using that are not in WSS -- I might be paying for licenses I don't need! :)
We use email. I know it isn't elaborate, but it is easy to use. Everyone has it installed and there are no licensing issues. All spec changes are sent to an super set email distro indicating the updates and the location on the network share where the spec can be found.
We use Alfresco, in its Community version, from both its Share and Explorer web interfaces.
Quite useful, with a document library, wiki, forum and calendar.
We curently host about 1.8 Go consisting mainly in docs, versionned and sometimes automatically converted to PDF (by creating an automatic content rule).
FTP, WebDav and network share are also used to access to the same repository.
You could take a look at Microsoft Groove - the collaboration software that Microsoft bought a few years back.
It's bundled free with premium versions of Microsoft Office.
You can customize the workspace with discussion boards and can fairly seamlessly store collaboratively-edited Office documents.
We use MediaWiki for dos & specs. Wiki definitely wins anything like Microsoft Word or SharePoint - it allows you to develop a documentation in "first refer, then describe" = "divide and rule" way. Perfect for developers - they used to think the same way. The process of developing a documentation is almost ideal: you start from TOC and drill down until you write the document for every link you put earlier.
MediaWiki is quite customizable - there are lots of extensions there. The most necessary ones are:
Source code highlighter - CSO_Source
Our own templates integrating wiki with class reference.
Others are InterWiki, FileProtocolLinks, YouTube (we use customized version of it to display HD video), ReCaptcha, SpecialDeleteOldRevisions, Maintenance.
Some integration examples are here.
And we use Google issue tracker to track the issues. Its main advantages:
Imput usability: the process of adding\changing the issue is really convenient there. Earlier we tried Track Studio - the same actions require 2-3 times more time there, so it died fast simply because most of us hated to use it.
Customizable grids. See the examples. Really helpful.
Atom\RSS support. So everyone knows what's going on.
There is a Gurtle tool integrating it with TortoiseSVN. Really helpful.
Its main disadvantage is that it can't be closed from the public access. This makes it simply unusable in many cases.
If you want a UI similar to Word, why not use Word with SharePoint 2007? You're on 2003 so the experience is there. Upgrade to SharePoint 2007 and you can have the collaboration, Word features, document sharing, and so on.
This is the kind of thing Microsoft wants people to use Office for, so there's a ton of doco out there about how to configure your SharePoint and Office environment to support collaboration.
There is something that Google do in this direction and it looks really cool: wave.google.com. It would be a great step in collaboration and worth to wait it.
Here we use Google Docs it makes the documents available to everyone write or read only, public or private among people that have or not Google accounts, it also can import Word docs, not to mention that it runs directly into the browser so it has high availability with zero cost and zero setup, also its computer/OS agnostic, we have a nice experience with it.
Also perhaps you should take a look at Basecamp or Backpack at 37Signals, any of then might also fit your bill.
We use DocBook for all of our specifications (and other customer-facing documentation). DocBook is an XML format that lets you easily generate documents in just about any format, including PDF, which is how we distribute things to clients to get them signed off. We can divide a document into files (by section) and commit everything to our source control system (Subversion). Because it is all XML (i.e. text-based), Subversion's automatic merging and conflict resolution works great if two people work on the same file. We have a set of stylesheets that all of our documents use, so all documents share the exact same style/format, with no extra work on our part.
And if you don't like editing XML files directly, there are GUI front-ends that provide a reasonably WYSIWYG-like experience. I believe that most people in my office use XMLMind. Still, we happen to all be technical people so if we had to write XML directly it wouldn't be an issue.
As a sidenote, we also put out release notes. We have some XSLT that lets us write documents like this:
<bugs>
<bug id="1234" component="web">JavaScript error when clicking the Kick Me button</bug>
</bugs>
We then have a script that runs through our Subversion repository doing an svn log from the previous release tag to the current release tag, and some Bugzilla integration to automatically generate release notes on-the-fly.
(also, for most internal-only documentation, we use MediaWiki, which is also a great way to collaborate.)
We use OnTime. It was originally only used for defect tracking, but we've started using it to track features as well. These can be used to document the feature as it evolves during development. Features can be grouped together into sprints or releases, and time can be tracked against each feature. If you are using SCRUM, you can also plot burn-down charts for each sprint. It also has wiki functionality.