Photoshop Script for flattening TIFF

I am using Photoshop CS5.5. Is there a way to write a JavaScript that can do the following?
For each TIFF image under a folder with many sub-folders,
if the TIFF file has more than one layer,
then flatten the image and save as the original to a PSD.

You can read about opening files in this answer.
Flattening an image is simple:
// Flatten all layers of the document currently open in Photoshop
var docRef = app.activeDocument;
docRef.flatten();
I don't know what you mean by "save as the original", but you can see the classes and options for saving documents on this page. That's quite a useful site, by the way; it lists many other classes and their respective options.
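For the whole workflow in the question, here is a minimal ExtendScript sketch (run it from File > Scripts > Browse in Photoshop). It assumes "save as the original" means writing a PSD next to each source TIFF with the same base name, and it checks only the top-level layer count; adjust both to your needs.

// Hypothetical starting point: ask the user for the root folder to process.
var rootFolder = Folder.selectDialog("Choose the folder to process");

function processFolder(folder) {
    var entries = folder.getFiles();
    for (var i = 0; i < entries.length; i++) {
        var entry = entries[i];
        if (entry instanceof Folder) {
            processFolder(entry); // recurse into sub-folders
        } else if (entry instanceof File && /\.tiff?$/i.test(entry.name)) {
            var doc = app.open(entry);
            // "More than one layer" is checked against the top-level layer count.
            if (doc.layers.length > 1) {
                doc.flatten();
                // Save a PSD next to the original, with the same base name.
                var psdFile = new File(entry.fullName.replace(/\.tiff?$/i, ".psd"));
                doc.saveAs(psdFile, new PhotoshopSaveOptions(), true);
            }
            doc.close(SaveOptions.DONOTSAVECHANGES);
        }
    }
}

if (rootFolder != null) {
    processFolder(rootFolder);
}

Because the PSD is saved as a copy and the TIFF is then closed without saving, the original TIFF files are left untouched.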

Related

Embedding PDF graphics in PDF output file programmatically

I am looking for a rough overview of how one would go about embedding graphics (coming from a PDF file) into another PDF file when writing a C++ document processor.
Background: I work on the LilyPond music typesetter, and recently added Cairo output to the system. Now I would like to support adding externally provided graphics to the PDF files that we generate (e.g. adding a logo onto a page that has already been laid out). This is trivial with EPS for PostScript output.
I can see how you could hook up Poppler to read the PDF, and render the PDF contents onto a Cairo surface, but I wonder if there is a simpler shortcut (e.g. embedding the PDF file as a binary stream and then pointing directly to that stream).
Going via an external route, like reading the PDF and writing it into an existing PDF using Cairo, would be simpler. To do it manually:
A PDF page consists of a stream of operators for drawing it, and a dictionary of external resources (fonts, images etc.). To stamp one PDF page onto another, you would need to:
a) Find all the external resource objects the stamp needs, and add them to the destination PDF.
b) Convert the page to a "Form XObject", which is a sort of reusable piece of content. Add this to the /XObject entry in the destination page's resources, making sure to pick a fresh name.
c) Add some operators to the destination page's content stream to invoke the new XObject.
To see how this might work, you could play with -stamp-as-xobject and -postpend-content "/XObjName Do" from section 8.4 of the cpdf manual.
Making this work for arbitrary PDFs is really not for the faint of heart, I'm afraid.
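If you want to see the Form XObject route in action before committing to a C++ implementation, the JavaScript library pdf-lib exposes this mechanism directly (it performs steps a–c for you). This is only an illustrative sketch, not a suggestion for LilyPond's code base; the file names are made up.

// Node.js sketch using pdf-lib (npm install pdf-lib); file names are hypothetical.
const fs = require("fs");
const { PDFDocument } = require("pdf-lib");

async function stampLogo() {
    const destDoc = await PDFDocument.load(fs.readFileSync("score.pdf"));
    // embedPdf() turns the requested page(s) of the source PDF into Form XObjects
    // and copies the resources they need into the destination document.
    const [logo] = await destDoc.embedPdf(fs.readFileSync("logo.pdf"));
    const page = destDoc.getPage(0);
    // Drawing the embedded page appends an operator invoking the XObject
    // to the destination page's content.
    page.drawPage(logo, { x: 50, y: 50, xScale: 0.5, yScale: 0.5 });
    fs.writeFileSync("score-with-logo.pdf", await destDoc.save());
}

stampLogo();

Under the hood this is the same sequence described above: the stamp page's resources are copied, the page becomes a Form XObject in the destination page's resources, and a Do operator is added to the content stream.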

Creating a searchable PDF from one already existing PDF and text (with coordinates)

My Situation:
I have an existing PDF with only images
I have preprocessed OCR output with all the text identified and its coordinates
An application running in C#
I can use other programming languages if needed
My Question:
Do you know a way to create a searchable PDF using those existing resources (images and text with their coordinates)?
I have done a lot of research, but most of the results only show how to create a searchable PDF using some library (iText, PDFsharp, etc.) and its built-in OCR engine. That is not my case; I already have the text and coordinates.
Thank you for any help and thoughts you can provide.
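One common approach, sketched below, is to overlay invisible text at the OCR coordinates on top of the existing image-only pages. The sketch uses the JavaScript pdf-lib library rather than a C# one, and the word list, coordinate origin, and file names are placeholders for your own OCR data.

// Node.js sketch using pdf-lib (npm install pdf-lib); the OCR data is hypothetical.
const fs = require("fs");
const { PDFDocument, StandardFonts } = require("pdf-lib");

// Example OCR result: text plus coordinates (assumed to be in PDF points with a
// top-left origin); replace with your real preprocessed OCR output.
const ocrWords = [
    { page: 0, text: "Invoice", x: 72, y: 40, size: 14 },
    { page: 0, text: "Total", x: 72, y: 600, size: 10 },
];

async function makeSearchable() {
    const doc = await PDFDocument.load(fs.readFileSync("scanned.pdf"));
    const font = await doc.embedFont(StandardFonts.Helvetica);
    for (const w of ocrWords) {
        const page = doc.getPage(w.page);
        // PDF coordinates are bottom-left based, so flip the y axis (rough baseline).
        const y = page.getHeight() - w.y - w.size;
        // opacity: 0 keeps the text invisible while leaving it selectable and searchable.
        page.drawText(w.text, { x: w.x, y: y, size: w.size, font: font, opacity: 0 });
    }
    fs.writeFileSync("searchable.pdf", await doc.save());
}

makeSearchable();

The same idea carries over to C# libraries, where the usual trick is an invisible text rendering mode rather than zero opacity.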

PDFClown image extraction images inverted

I'm working with PDFClown and I'm trying to extract images from a PDF file. I use the example code (ImageExtractionSample.java) provided with the source code that can be found at http://pdfclown.org.
The problem is that the extracted images are negative and flipped horizontally. Does anyone know how to resolve this?
Check with other PDF files to see whether they also give rotated or flipped images. ImageExtractionSample.java does not check rotation or matrix-defined transformations for the image object; it just writes the content to a file as-is (so it will work for JPEG images but not, for example, for CCITT-encoded images).
So there are several things to consider when you extract images from a PDF:
the image can be rotated by the current transformation matrix (CTM);
the image can be rotated/transformed as part of a form XObject that is itself transformed;
the image can be placed without transformation on a page, but the page itself is rotated;
the image may have a mask overlaid on top of it (and the mask can itself be rotated and transformed);
a JPEG image is stored pretty much as-is, but PDF supports other formats such as CCITT- and LZW-compressed images.
But the general suggestion is that when you extract JPEG images from a PDF using PDFClown, you should just flip and rotate the extracted images as suggested on the SourceForge project discussion page (a small post-processing sketch follows at the end of this answer).
If you could point to the particular PDF sample file, it would be easier to suggest a solution.
If you're on Windows, you may use the free PDF Multitool utility to compare non-transformed and transformed images from a PDF using the "Extract raw images (without transformation)" option in the image extraction dialog.
Disclaimer: I work for ByteScout, the PDF Multitool utility is free for both commercial and non-commercial purposes.
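If it is enough to fix the extracted files after the fact (rather than inside the Java extraction code), here is a small post-processing sketch with the Node.js sharp library; sharp is not part of PDFClown, and the file names are placeholders.

// Node.js sketch using sharp (npm install sharp); file names are hypothetical.
const sharp = require("sharp");

// Undo the two symptoms from the question: negative colours and horizontal mirroring.
sharp("extracted.jpg")
    .negate()   // invert the colours
    .flop()     // mirror around the vertical axis (horizontal flip)
    .toFile("fixed.jpg")
    .then(() => console.log("done"))
    .catch((err) => console.error(err));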

How to clip and concatenate a page region in multiple pdf files with one page each?

I have a lot of PDF files, each one with an image inside. I want to clip a rectangular region in each of these files and concatenate them into a single PDF file. Is it possible with Ghostscript or similar?
I'll have a go at this. Try Briss if you want to crop rectangular regions in PDF files. It's a free, cross-platform GUI.
If you have multiple PDF files, you can concatenate/merge them first online using http://www.pdfmerge.com/. Then use Briss to crop the images out into a new PDF file, or vice versa, depending on the location of your images inside the PDF files.
After you fire up Briss, load the merged PDF file containing the images. When you're asked if you want to exclude anything, just click "Cancel" if you want to include all pages.
If your file has many pages, similar pages will be overlaid on each other so you can draw a rectangle over the region you want to crop. Click Action -> Preview to preview the output. Click Action -> Crop PDF to finalize your output PDF file. Cheers.

Extract layers from PDF file to HTML

I have a PDF file containing layers.
For example, on some pages there are graphs, with additional data (layers) displayed on top of the graph when clicking.
Now I need to fetch all these layers out of the PDF file; to be precise, I need ALL the data from that PDF file, including layers. The PDF file contains JavaScript to show/hide the layers when appropriate.
What is the best approach? Is there any tool that actually works for my purposes, or should I write something myself (if that is possible, of course)?
Edit:
Here you can download the PDF file:
http://www.2shared.com/document/IutUfDfr/OR_erasmus.html
The password for viewing is: erasmus
I do not know of any tools for this per se, but if you cannot find one, you might do the following:
For each combination of on/off layers that you are interested in, walk all pages and collect the content streams. Tokenize those and cut out the content you do not want to see (the operators you need to monitor to determine this are BDC and EMC). Save the stream again with the clipped content (naturally, save each result in a different file). You need something to read the PDF object structure and update some objects (there are lots of libraries for that), plus you need to be able to parse the content streams.
Now you will have a set of PDF files without layers (optional content), for which there are plenty of tools to render to HTML etc.
Note: optional content groups and the layer switches in the PDF viewer are usually 1:1, but the standard supports a full n:m mapping. To keep things simple, I would concentrate on the real optional content blocks that can be turned on/off.
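As a toy illustration of the BDC/EMC part only, here is a sketch that strips every /OC marked-content block from a content stream that has already been decompressed to a string. A real implementation needs a proper PDF tokenizer, must handle inline property dictionaries, and would keep or drop blocks per layer combination instead of removing them all.

// Toy sketch (plain JavaScript, runnable in Node.js). It assumes the BDC properties
// operand is a single name (e.g. "/OC /MC0 BDC"), not an inline dictionary, and that
// tokens are whitespace-separated.
function stripOptionalContent(stream) {
    const tokens = stream.split(/\s+/);
    const out = [];
    let depth = 0; // nesting depth while inside an /OC BDC ... EMC block
    for (let i = 0; i < tokens.length; i++) {
        const t = tokens[i];
        if (depth === 0 && t === "BDC" && tokens[i - 2] === "/OC") {
            out.splice(-2); // drop the "/OC" tag and property name already copied
            depth = 1;      // start skipping until the matching EMC
            continue;
        }
        if (depth > 0) {
            if (t === "BDC" || t === "BMC") depth++; // nested marked content
            else if (t === "EMC") depth--;           // closes one nesting level
            continue;                                // skip everything inside the block
        }
        out.push(t);
    }
    return out.join(" ");
}

// Example: the rectangle inside the /OC block disappears, the rest of the stream stays.
const content = "0 0 m 100 100 l S /OC /MC0 BDC 50 50 10 10 re f EMC 1 0 0 RG";
console.log(stripOptionalContent(content)); // "0 0 m 100 100 l S 1 0 0 RG"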
You can use this tool to extract images and text even from locked PDFs:
http://download.cnet.com/Able2Extract/3000-2079_4-10249654.html
I use it myself sometimes, and it has the ability to convert to HTML.