I am trying to draw objects (through scripting) on specific artboards (and layers), but I can't find any example. I thought this piece of code would be enough, but objects still appear on the first artboard.
var doc = app.activeDocument;
doc.artboards.setActiveArtboardIndex (1);
Thanks for help
Fransua
I followed the guide at this URL: http://developers.itextpdf.com/content/itext-7-jump-start-tutorial/chapter-6-reusing-existing-pdf-documents
Following that guide, I had a problem where some content from the PDF was not copied into the destination PDF when using copyAsFormXObject (which I submitted a support ticket for). An alternative I found in the meantime was that I could use the PdfDocument's copyPagesTo method and simply open the page that was copied with getPage on the destination PDF. From that, I can create a PdfCanvas from the existing page and do our transformations (such as scaling) on the object.
This seems to work exactly like the code in the aforementioned guide, except that for the PDFs where content wasn't copied before, the content now appears to be copied.
Are there any drawbacks to using the copyPagesTo method to copy the content as opposed to what the guide suggests (copyAsFormXObject)? Performance, memory, or extraneous non-visible content, etc.?
Code that exhibits this problem:
PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
PdfDocument origPdf = new PdfDocument(new PdfReader(src));
PdfPage origPage = origPdf.getPage(1);
PdfPage page = pdf.addNewPage();
PdfCanvas canvas = new PdfCanvas(page);
PdfFormXObject pageCopy = origPage.copyAsFormXObject(pdf);
canvas.addXObject(pageCopy, 0, 0);
pdf.close();
origPdf.close();
Code that does not:
PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
PdfDocument origPdf = new PdfDocument(new PdfReader(src));
origPdf.copyPagesTo(1,2,pdf);
pdf.close();
origPdf.close();
I've provided code and answers for the specific problem on your support ticket.
As for the difference between copyPagesTo() and copyAsFormXObject() for copying pages:
copyPagesTo() is a high-level method that copies over the entire page, maintaining all structure and adding any applicable resources to the new document.
With copyAsFormXObject(), you first need to transform the page to an XObject, essentially turning it into an appearance stream. If this page needs additional settings or resources to be displayed correctly, such as a different page size or fonts that were not stored on the page itself, they need to be manually set or added. XObjects are always added at absolute positions, so this needs to be specified too.
While copying using low-level methods such as XObjects grants a lot more control over what the result can look like, they come with their own dangers and pitfalls. For ubiquitous tasks such as copying pages, it is better to use the high-level methods to avoid such possible problems.
EDIT:
We've decided that this behaviour is a bug and that 'copyAsFormXObject()' should include the used resources even if they're stored at the /Pages level. This will be fixed in a later release of iText.
When trying to create an image from a signed PDF page, the resulting image shows the signatures, but they are not displayed correctly.
For example, the original contains two signatures next to each other in the bottom section.
In the resulting image the signatures look like they have been scaled up and are overlapping.
Furthermore, there's a signature in the top right corner. This signature looks scaled up in the resulting image and is cut off to the right. What is happening here? What am I doing wrong? I'm pretty new to working with PDFs on this level.
Hope that makes sense. Please see below for the differences (I've cut out other content).
Here's the code I'm using:
List<PDPage> pages = inputDocument.getDocumentCatalog().getAllPages();
PDPage page = pages.get(0);
BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, PDF_RESOLUTION);
String fileName = "converted_image_" + (i + 1);
ImageIOUtil.writeImage(image, "png", fileName, BufferedImage.TYPE_INT_RGB, PDF_RESOLUTION);
here's the original
and now the distorted version
As Tilman Hausherr pointed out, I was using the current 1.8.x stable release, which has problems with annotation appearances. That caused the behaviour I was seeing. Testing with the current 2.0 SNAPSHOT solves this problem.
Now we are eagerly awaiting the release of 2.x :)
From what I've seen, they totally reworked how creating images from a PDF(Page) should be done so I'm not sure about the probability of a backport.
Hope that helps for anyone else coming across this.
I am trying to understand the .mesh files, usually generated for mesh visualization with Medit.
The documentation is here, but it is in French.
What I understand is that every line describing an object in the file (vertex, triangle, tetrahedron, etc.) ends with a ref value. In the example files I have, these are usually 0, 1, 2, or 3, and I don't understand what their purpose is.
Can somebody please explain this?
You can get a .mesh example here.
Each reference corresponds to a color in Medit. The colors are arbitrary, and can be changed in Medit (using the GUI or changing a configuration file).
The reference values in the mesh file refer to a color index. Maybe the program uses this to display the vertices, triangles and tetrahedra with certain colors. You can ignore this value for all practical purposes.
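To make the role of the ref column concrete, here is a minimal Python sketch that tallies the trailing ref value per section. It assumes the usual ASCII .mesh layout (a keyword, an entity count, then one entity per line ending with its ref) and only knows a handful of section keywords. The function name count_refs and the sample data are made up for illustration.

```python
from collections import Counter

# Number of vertex indices per entity for a few common sections.
# This is an assumed subset; real files may contain other keywords.
INDICES_PER_ENTITY = {"Edges": 2, "Triangles": 3, "Quadrilaterals": 4, "Tetrahedra": 4}


def count_refs(mesh_text):
    """Return {section: Counter of ref values} for an ASCII .mesh file."""
    tokens = mesh_text.split()
    dimension = 3  # overwritten by the Dimension keyword if present
    refs = {}
    i = 0
    while i < len(tokens):
        key = tokens[i]
        if key == "Dimension":
            dimension = int(tokens[i + 1])
            i += 2
        elif key == "Vertices" or key in INDICES_PER_ENTITY:
            # Each entity is `width` values followed by one ref value.
            width = dimension if key == "Vertices" else INDICES_PER_ENTITY[key]
            count = int(tokens[i + 1])
            start = i + 2
            counter = Counter()
            for k in range(count):
                counter[int(tokens[start + k * (width + 1) + width])] += 1
            refs[key] = counter
            i = start + count * (width + 1)
        else:
            i += 1  # skip keywords this sketch does not understand
    return refs


SAMPLE = """MeshVersionFormatted 1
Dimension 3
Vertices
4
0 0 0 1
1 0 0 1
0 1 0 2
0 0 1 2
Triangles
2
1 2 3 0
1 2 4 3
Tetrahedra
1
1 2 3 4 0
End
"""

for section, counter in count_refs(SAMPLE).items():
    print(section, dict(counter))
```

Grouping entities by ref like this is also how the value can be put to use outside Medit, for example to tag boundary faces or subdomains, if whoever generated the file chose the refs with that in mind.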
I have a set of PDFs, from which I want to process (in VB.NET) only those that are not text-searchable. Can you please tell me how to go about this?
Generally speaking, the way to do this is open up each page and rip the content stream and see if any text operators are executed that place text on the page.
Let me explain what that means - PDF content is a small RPN language that contains operations that mark the page in some way. For example, you might see something like this:
BT 72 400 Td /F0 12 Tf (Throatwarbler Mangrove) Tj ET
Which means:
Begin a text area
Set the position of the text baseline to (72, 400) in PDF units
Set the font to a resource named F0 from the current page's font resource dictionary
Draw the text "Throatwarbler Mangrove"
End a text area
So you can try shortcuts:
Does my page's resource dictionary contain any fonts?
This will fail in some cases because some PDF generation tools put fonts into the resource dictionary and don't use them (false positive). It will also fail if the page content contains a Form XObject which contains text (false negative).
Does my page's content stream have BT/ET operators?
This will get you closer, but will fail if there is no content in them (false positive) or if they're not present, but there's a Form XObject which contains text (false negative).
So really, the thing to do is to execute the entire page's content stream, including recursing on all XObject to look for text operators.
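As a rough illustration of that scan, here is a Python sketch that tokenizes an already-decoded content stream and looks for the text-showing operators (Tj, TJ, ' and "). It is a simplification under several assumptions: the stream has already been decompressed and decoded to a string, inline image data (BI/ID/EI) is not handled, and recursing into Form XObjects is left to the caller, since that needs a real PDF library to resolve the resources. The function names are made up for this example.

```python
import re

DELIMS = "()<>[]{}/%"
NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")
TEXT_SHOWING = {"Tj", "TJ", "'", '"'}


def operators(content):
    """Yield operator tokens, skipping strings, names, numbers and comments."""
    i, n = 0, len(content)
    while i < n:
        c = content[i]
        if c == "(":  # literal string: may nest and contain backslash escapes
            depth, i = 1, i + 1
            while i < n and depth:
                if content[i] == "\\":
                    i += 1  # skip the escaped character as well
                elif content[i] == "(":
                    depth += 1
                elif content[i] == ")":
                    depth -= 1
                i += 1
        elif content.startswith("<<", i):  # dictionary open, not a hex string
            i += 2
        elif c == "<":  # hex string
            end = content.find(">", i)
            i = n if end == -1 else end + 1
        elif c == "%":  # comment runs to the end of the line
            while i < n and content[i] not in "\r\n":
                i += 1
        elif c.isspace() or c in "[]{})>":
            i += 1
        else:
            # A name (/F0), a number, or an operator: read up to the next delimiter.
            j = i + 1
            while j < n and not content[j].isspace() and content[j] not in DELIMS:
                j += 1
            token = content[i:j]
            i = j
            if not token.startswith("/") and not NUMBER.fullmatch(token):
                yield token


def has_text_operator(content):
    """True if the stream itself executes a text-showing operator."""
    return any(op in TEXT_SHOWING for op in operators(content))


def invokes_xobject(content):
    """True if the stream paints an XObject (Do), which may itself contain text."""
    return "Do" in operators(content)


page = "BT 72 400 Td /F0 12 Tf (Throatwarbler Mangrove) Tj ET"
print(has_text_operator(page))
```

A page for which has_text_operator is false but invokes_xobject is true is the false-negative case described above: each referenced Form XObject's stream would have to be scanned the same way before concluding that the page has no text.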
Now, there's another approach that you can take using Atalasoft's software (disclaimer: I work for Atalasoft and have written most of the PDF handling code; I also worked on Acrobat versions 1-4). Instead of asking "does this page contain any text?", you can ask "does this page contain only a single image?"
bool allPagesImages = true;
using (Document doc = new Document(inputStream))
{
    foreach (Page p in doc.Pages)
    {
        if (!p.SingleImageOnly)
        {
            allPagesImages = false;
            break;
        }
    }
}
Which will leave allPagesImages with a pretty decent indication that each page is all images, which, if you're looking to OCR the non-searchable documents, is probably what you really want.
The downside is that this will be a very high price for a single predicate, but it also gets you a PDF rasterizer and the ability to extract the images directly out of the file.
Now, I have no doubt that a solid engineer could work their way through the PDF spec and write some code to extend iTextPdfSharp to do this task. I think that if I sat down with it, I might be able to write that predicate in a few days, but I already know most of the PDF spec. So it might take you more like two weeks to a month. So your choice.
You could consider this option, though I haven't tested the code yet: I think it can be done by reading the properties of each PDF file that you want to process.
You might check this link :
http://www.codeguru.com/columns/vb/manipulating-pdf-files-with-itextsharp-and-vb.net-2012.htm
You have to read the producer property of each file when you process it. That's just an example. My advice is to include your code here so we can try to help you. Bless you
Consider the following: I have paragraph data being sent to a view, which needs to be placed over a background image that has fixed elements at the top and the bottom (fig1).
Fig1.
My thought was to split this into 4 labels (Fig1.example2). My question is how I can get the text to flow through labels 1 - 4, given that labels 1, 2 & 3 are of fixed height. I assumed here that label 3 should be populated prior to 4, hence the layout in the attached diagram.
Can someone suggest the best way of doing this with maybe an example?
Thanks
Wish I could help more, but I think I can at least point you in the right direction.
First, your idea seems very possible, but would involve lots of calculations of text size that would be ugly and might not produce ideal results. The way I see it working is a binary search, testing portions of your string with sizeWithFont: until you get the best guess for how much text fits in the label at that size and still looks "right". Then you have to actually break up the string and track it in pieces... just seems wrong.
In iOS 6 (unfortunately doesn't apply to you right now but I'll post it as a potential benefit to others), you could probably use one UILabel and an NSAttributedString. There would be a couple of options to go with here (I haven't done it so I'm not sure which would be the best), but it seems that if you could format the page with HTML, you can initialize the attributed string that way.
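For what it's worth, the binary search itself is short. The sketch below is in Python rather than Objective-C, and it uses a made-up stand-in for the measurement: a character-count capacity per label instead of a real sizeWithFont: call. It assumes that if a prefix of the text fits, every shorter prefix fits too. The names split_to_fit and flow_text are hypothetical.

```python
def split_to_fit(words, fits):
    """Return (head, tail): head is the longest prefix of words whose joined text fits."""
    lo, hi = 0, len(words)
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if fits(" ".join(words[:mid])):
            lo = mid       # this many words fit, try more
        else:
            hi = mid - 1   # too many words, try fewer
    return words[:lo], words[lo:]


def flow_text(text, capacities):
    """Fill each fixed-size label in order; the last label takes whatever is left."""
    words = text.split()
    chunks = []
    for cap in capacities[:-1]:
        # Stand-in measurement: a real version would measure the rendered size.
        head, words = split_to_fit(words, lambda s, cap=cap: len(s) <= cap)
        chunks.append(" ".join(head))
    chunks.append(" ".join(words))
    return chunks


print(flow_text("one two three four five six", [7, 10, 0]))
```

Splitting on words rather than characters keeps the breaks at sensible places, but it still leaves the bookkeeping of several string pieces that makes this approach awkward.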
From the docs:
You can create an attributed string from HTML data using the initialization methods initWithHTML:documentAttributes: and initWithHTML:baseURL:documentAttributes:. The methods return text attributes defined by the HTML as the attributes of the string. They return document-level attributes defined by the HTML, such as paper and margin sizes, by reference to an NSDictionary object, as described in “RTF Files and Attributed Strings.” The methods translate HTML as well as possible into structures of the Cocoa text system, but the Application Kit does not provide complete, true rendering of arbitrary HTML.
An alternative here would be to just use the available attributes, setting line indents and such according to the image size. I haven't worked with attributed strings at this level, so the best reference would be the developer videos and the programming guide for NSAttributedString. https://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/AttributedStrings/AttributedStrings.html#//apple_ref/doc/uid/10000036-BBCCGDBG
For earlier versions of iOS, you'd probably be better off becoming familiar with CoreText. In the end you'll be rewarded with a better looking result, reusability/flexibility, the list goes on. For that, I would start with the CoreText programming guide: https://developer.apple.com/library/mac/#documentation/StringsTextFonts/Conceptual/CoreText_Programming/Introduction/Introduction.html
Maybe someone else can provide some sample code, but I think just looking through the docs will give you less of a headache than trying to calculate 4 labels like that.
EDIT:
I changed the link for CoreText
You have to go with CoreText: create your AttributedString and a CTFramesetter with it.
Then you can get a CTFrame for each of your textboxes and draw it in your graphics context.
https://developer.apple.com/library/mac/#documentation/Carbon/Reference/CTFramesetterRef/Reference/reference.html#//apple_ref/doc/uid/TP40005105
You can also use a UIWebView.