Overlaying one PDF over another - pdf

I am working on a new in-house automated artwork workflow system. The new system delivers a stepped up PDF ready for offset print. The artwork is black only with barcodes and variable data text. There might be between 4 and 50 stepped up artworks per SRA3 sheet.
However, the new system is unable to add tick marks, gripper marks and other information around the edge of the sheet that our production teams would like. Our attempts to add these are not always fit for purpose, as the off-the-shelf software used to generate the variable data was intended for thermal printers.
What would be better is if we had the tick marks saved on another series of template PDFs. We could then superimpose these on the automated artwork.
We are talking about hundreds of PDFs per day, and we would need this process to be as simple as possible, requiring little skill on the part of the operator, or even automated/scripted.
I have read a similar post where adding a watermark with Acrobat was recommended. This worked a treat for me, but considering the high volumes of artwork, even if made into an Acrobat Batch Process/Action, this would be too involved.
Any ideas welcome: Rip Software, AppleScript, MSDOS!!! Whatever!
Cheers
Tim
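One way the scripted route could look is sketched below. It assumes the free command-line tool pdftk is installed and uses its `stamp` operation, which overlays one PDF on top of another. The folder layout, the file name `marks.pdf` and the helper names are hypothetical, and the template is assumed to have the same sheet size as the artwork and no opaque background.

```python
from pathlib import Path
import shutil
import subprocess


def build_stamp_command(artwork: Path, template: Path, out_dir: Path) -> list:
    """Return the pdftk command that stamps `template` on top of `artwork`."""
    return [
        "pdftk", str(artwork),
        "stamp", str(template),                  # overlay template on every page
        "output", str(out_dir / artwork.name),   # keep the original file name
    ]


def stamp_folder(in_dir: Path, template: Path, out_dir: Path, dry_run: bool = True) -> list:
    """Build (and, unless dry_run, execute) one stamp command per PDF in `in_dir`."""
    commands = [
        build_stamp_command(pdf, template, out_dir)
        for pdf in sorted(in_dir.glob("*.pdf"))
    ]
    if not dry_run:
        if shutil.which("pdftk") is None:
            raise RuntimeError("pdftk was not found on PATH")
        out_dir.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if pdftk reports an error
    return commands


if __name__ == "__main__":
    # Hypothetical folders: print the commands first, then rerun with dry_run=False.
    for cmd in stamp_folder(Path("artwork_in"), Path("marks.pdf"), Path("artwork_out")):
        print(" ".join(cmd))
```

Run on a schedule or from a watched folder, something like this would need no operator input. If different sheet layouts need different marks, the template could be chosen per file inside `stamp_folder`.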

Related

Is There a Quick, Efficient Way to Add Large Numbers of Labels in Either ArcGIS or QGIS?

In 2007, when I was young and foolish and before I knew about OpenStreetMap, I started an urban historical map project. I was working in Illustrator, it was going to be an interactive Flash piece, and my process was to draw the maps first, with the thought that I'd label some, but not all, of the streets later on.
As we know, Flash began to die around 2010, and I put the project away for a number of years. I picked it up again a couple of years ago and continued my earlier practice of just drawing streets and water features, this time with the intention of making it a conventional web map. Now I'm pretty close to finishing the drawing of a five-layer (1871, 1903, 1932, 1952 and 2016) historical map of a medium-sized city, though it still lacks labels.
My problem now is how to add large numbers of labels, many of them duplicates. There could be as many as 10,000 for all five layers, though as a practical matter I may have to settle for a smallish fraction of that number. Based on web searches I gather my workflow is unusual and that mine is therefore an unusual problem.
I've exported my maps and brought them into QGIS and played with the software a little. The process of adding labels to objects doesn't seem terribly efficient or user-friendly, but that's probably due to my unfamiliarity with the program.
So my question is this: Are there any tricks to speed up the painful process of adding large numbers of duplicate labels in either QGIS or ArcGIS? Since so many of the streets exist in all five layers, functionality like the ability to select multiple objects in different layers and edit their attributes simultaneously in the Attribute Table would be a godsend. (Doesn't seem possible.) So would the ability to copy the attributes from one object and paste them onto other objects. Or the ability to do either of these things in Illustrator via a plugin and then export the data along with the shapes to a GIS program.
Thanks for your help!
If I understand the issue correctly, I think there are several different solutions. When you say that you
Typically for a spatial layer in ArcGIS or QGIS you define how to label all features in a layer once by defining a label scheme to use across all features, 1 or 1 million. This assumes that each feature in the layer has one or more attributes in the associated table for the layer.
How are you converting the Illustrator vectors to a spatial layer? DXF?
You will likely get better/faster responses to this question by posting it to GIS Stack Exchange: https://gis.stackexchange.com/
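Since labels are driven by attributes, the bulk of the work is getting a name into each feature's attribute table once. A minimal sketch of copying names between layers follows, outside of any GIS package. It assumes each street carries a shared identifier across the five layers; the `street_id` and `name` fields and the plain-dictionary layout are hypothetical stand-ins for exported attribute tables.

```python
def propagate_names(layers, key="street_id", field="name"):
    """Fill empty `field` values from any layer that already names the same `key`.

    `layers` maps a layer name (e.g. "1871") to a list of feature dicts.
    Returns the number of features that were filled in.
    """
    # First pass: remember the first non-empty name seen for each street id.
    known = {}
    for features in layers.values():
        for feat in features:
            if feat.get(field):
                known.setdefault(feat[key], feat[field])

    # Second pass: copy the known name onto features that lack one.
    filled = 0
    for features in layers.values():
        for feat in features:
            if not feat.get(field) and feat[key] in known:
                feat[field] = known[feat[key]]
                filled += 1
    return filled


layers = {
    "1871": [{"street_id": 1, "name": "Main St"}],
    "1903": [{"street_id": 1, "name": ""}, {"street_id": 2, "name": None}],
}
print(propagate_names(layers))  # street 1 is filled in 1903; street 2 stays unnamed
```

With names in place, a single label rule per layer that points at the `name` field would label every feature at once.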

PDF data extraction

Is there a way for me to take a scanned PDF image and extract data from the image by highlighting the fields that are needed? We scan thousands of PDF images of real estate deeds daily and would like to be able to automate the data entry process. The problem that we are facing is that no two deeds are the same.
It has been said in the comments that Stack Overflow is mainly about programming issues.
Nevertheless, there are possibilities, depending on the actual documents, and the volumes to be processed.
On the high end, there is a product called Teleform, originally developed by Cardiff, and now owned by HP, which is used to process paper forms; you may also look at the Business Process application Cardiff LiquidOffice, now HP LiquidOffice.
On the low end, I have developed an application in PDF, running under Acrobat, which can take a scanned and OCR'd form and transfer the data to a specially prepared fillable form, from where the data can be exported to a database, for example. For more information, a demo and a quote, feel free to contact me in private.
If you want to develop something using Acrobat, you could also begin with an OCR'd document, and then use the capabilities of the Redaction function (or use the industrial-strength redaction tool Redax by Appligent) to find keywords, and then use the positional information of those keywords to extract more data.
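The keyword-position idea can be sketched independently of Acrobat. The example below assumes the OCR step yields a list of words with page coordinates; the `Word` structure, the tolerance value and the sample deed fields are hypothetical, and it does not show how Acrobat or Redax expose this information.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # vertical position of the word (any consistent unit)


def value_right_of(words, keyword, line_tolerance=3.0, max_words=3):
    """Return up to `max_words` words on the same line, right of `keyword`.

    Returns None when the keyword is not found on the page.
    """
    anchors = [w for w in words if w.text.rstrip(":").lower() == keyword.lower()]
    if not anchors:
        return None
    anchor = anchors[0]
    # Words count as "same line" if their vertical position is close enough.
    same_line = [
        w for w in words
        if abs(w.y - anchor.y) <= line_tolerance and w.x > anchor.x
    ]
    same_line.sort(key=lambda w: w.x)
    return " ".join(w.text for w in same_line[:max_words])


page = [
    Word("Grantor:", 50, 100), Word("Jane", 120, 101), Word("Doe", 160, 100),
    Word("Grantee:", 50, 130), Word("John", 120, 130),
]
print(value_right_of(page, "grantor", max_words=2))
```

Because no two deeds are laid out the same, anchoring on keywords rather than fixed page positions is what makes this kind of approach workable; each field would still need its own keyword list and checking by a person.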

What methods to recognize sentence handwriting?

I mean recognition per sentence, not per letter, for example a doctor's prescription handwriting, which is hard to read, not just normal handwriting.
For example:
I use data mining or machine learning to train on handwritten paper samples.
The user scans a paper with hard-to-read writing.
The application does the image processing.
The output is the sentences from the paper.
And what device should I use? (Scanner or webcam)
I am a newbie. If possible, I would like some examples in VB.NET with EmguCV/OpenCV, and research journal references.
Any help would be appreciated.
Welcome to Stack Overflow! The answer to your question is twofold:
a. If you want to recognize handwriting that has already happened i.e. it is presented to you as an image you are in trouble. Computer Vision is still not good enough to provide you with reasonable accuracy.
b. If you have a chance to recognize handwriting “as it's happening” - you are in luck. Download, for example, a Gesture Search app from Android play store and you are in business.
The difference between the two scenarios is subtle but significant. In the second case you have an extra piece of information that makes handwriting recognition possible. This piece is timing of each stroke. In other words, instead of an image with handwriting you have a bunch of strokes that are all labeled with their time stamps. You can think about it as a sequence of lines and curves or as image segmentation - in any way this provides a big hint for the system. Additional help comes from the dictionary on your phone but this is typically used by any handwriting system.
Android of course has an open source library for stroke recognition (find more on your own). If you still want to go for recognizing images though, you have to first detect text (e.g. as a bounding box) and second use any of the existing engines to process detected regions. For text detection I can recommend MSER. But be careful trying to implement even text detection on your own - you are entering a world of pain here ;). Here is an article that can help.
As for learning how to recognize text from images on the Internet - this can be your plan B or C or Z when you master above mentioned stages. Don’t try to abuse learning methods and make them do hard work for you - you will hit a wall if you don’t understand what’s going on under the hood.

Do PDFs sizes change depending on the program that opens it?

I work for a company designing t-shirts. We get transfers printed by another company. The transfers we received for a recent design were too small, as I'm guessing they were printed portrait instead of landscape. The representative from the company we order prints from is claiming that it isn't their fault and that...
"Sometimes the images we receive are not the size you send. This is due to the different formats of the files and the way our computers convert them."
He says it's our fault because we didn't send the design dimensions to him along with the design. The file sent was a PDF. Am I correct in understanding that a PDF will always open at its intended size? I thought the size was embedded within the PDF. I'm fairly certain I'm correct, but don't want to basically call him out for it without knowing I'm correct. Do computers really convert PDFs so that they're a completely different size? That would be a terrible way for PDFs to operate, if that's the case.
The size is definitely embedded in the PDF, and the purpose of the PDF format is to be the final document format before printing. My advice is to send a note with the desired dimensions; perhaps it's their machine that resizes the picture.
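To make "embedded" concrete: each PDF page carries a MediaBox, four numbers in points, where one point is 1/72 inch by default. The sketch below only does the unit conversion; the function name is made up for illustration, reading the MediaBox out of a file is left to whatever PDF tool you have, and it ignores the optional page rotation and UserUnit settings.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def media_box_size_mm(media_box):
    """Convert a MediaBox (llx, lly, urx, ury) in points to (width, height) in mm."""
    llx, lly, urx, ury = media_box
    to_mm = MM_PER_INCH / POINTS_PER_INCH
    width_mm = abs(urx - llx) * to_mm
    height_mm = abs(ury - lly) * to_mm
    return (round(width_mm, 1), round(height_mm, 1))


# An A4 portrait page is commonly written as [0 0 595.276 841.89].
print(media_box_size_mm((0, 0, 595.276, 841.89)))  # (210.0, 297.0)
```

If the printer's software opens the file at any other size, it is scaling on their side (for example a "fit to page" option), not something the PDF left undefined.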

Small embedded synthesized speech libraries/suggestions

Are there any easy-to-use free or cheap speech synthesis libraries for PIC and/or ARM embedded systems where code size is more important than speech quality? Nowadays it seems that a 1 meg package is considered "compact", but a lot of microcontrollers are smaller than that. Back in the 1980's Apple hired a contractor to produce Macintalk, which offered reasonable-quality speech in a 26K package which ran on a 7.16MHz 68000, and a program called SAM could produce speech that wasn't quite as good, but still serviceable, with a 16K package that ran on a 1MHz 6502. The SpeakJet runs a speech-synthesis algorithm on some type of PIC.
I probably wouldn't particularly need to produce speech, but would want to be able to speak messages formed from a number of pre-set words. Obviously it would be possible to simply prerecord all the messages, but with a vocabulary of e.g. 100 words, I would think that storing 16K worth of code plus maybe 1K worth of phonetic strings would be more compact than storing audio for 100 words.
Alternatively, if I wanted to store audio for 100 words, what would be the best way of generating a set of words that would flow naturally together? On older-style speech synthesizers, any given word could be spoken three ways: neutral inflection, falling inflection (as if followed by a period), or rising inflection (followed by a question mark). Words with neutral inflection could be spliced together in any order and sound fine. The text-to-wave tools I've found, though, seem to like to add finer details of inflection which sound "off" if words are cut apart and resequenced. Are there any tools which are designed for producing waves that can be concatenated and spliced nicely? If I do use such a tool, what audio format would be best for storing the waves so as to allow efficient decoding on a small microcontroller?
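The storage comparison above can be put into rough numbers. The figures below are assumptions chosen only for illustration: half a second per word, 8 kHz sampling, and either 8-bit samples or a codec that averages 4 bits per sample. Only the 16K of code and 1K of phonetic strings come from the estimate in the question.

```python
def audio_bytes(words, seconds_per_word, sample_rate_hz, bits_per_sample):
    """Bytes needed to store `words` recordings of equal length."""
    total_samples = words * seconds_per_word * sample_rate_hz
    return int(total_samples * bits_per_sample / 8)


# Synthesizer route: about 16K of code plus about 1K of phonetic strings.
synth_bytes = 16 * 1024 + 1 * 1024

# Recorded route under the assumed parameters.
pcm_8bit = audio_bytes(100, 0.5, 8000, 8)
coded_4bit = audio_bytes(100, 0.5, 8000, 4)

print(synth_bytes)  # 17408
print(pcm_8bit)     # 400000
print(coded_4bit)   # 200000
```

Even with the recordings halved to 4 bits per sample, this estimate puts stored audio at more than ten times the size of the synthesizer, which supports the intuition in the question.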
Last time I did this I was able to add hardware like http://www.sparkfun.com/products/9578. There may be patent liabilities in your environment, as I ran into, that force a commercial software stack or OTS chip.
Otherwise, I've used http://www.speech.cs.cmu.edu/flite/ for more lenient projects, and it worked well.