I am looking for an easy solution for the following problem:
I have to create variants of a document and export them as images. This could easily be done with MS Word Mail Merge, but I also need the pixel positions of every text block in that document. The images and the pixel positions are the input for training an AI model.
At the moment I can think of several approaches:
1. Throw the MS Word Mail Merge output into an OCR and try to identify the positions of the text blocks by comparing them with the original text source.
2. Create the document with something like JS, Python or Visual Basic and save the exact positions of each inserted text block at the time of inserting.
3. Maybe use Visual Basic for Word to extract the text positions from the MS Word XML file that was created with the Mail Merge function.
Variant 1 seems overly complicated because it amounts to reverse engineering. Additionally, OCR can always be a source of error, even on a perfectly readable document.
So variants 2 or 3 seem fine, but I don't know any libraries that fit the requirements and Visual Basic for Word is absolutely new territory for me.
I hope I described the problem well enough. If you want me to clarify something, please let me know.
I appreciate every idea and help! :)
Best Regards
Henrik
Seems like someone already dislikes my post. Please let me know how I can improve before voting me down..
Anyway, I may have found a way to realize variant 2. This Stack Overflow post references a GitHub Gist that extends the Python Imaging Library. It offers a function to write text on an image and also set a maximum width for the text box. The function also returns the final width and height of the drawn text box. Using this I will try to implement an algorithm that creates the document images as well as the label files.
Maybe this will also help someone else looking for the same thing.
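A minimal sketch of the bookkeeping behind variant 2, not the Gist's actual code: it wraps a text to a maximum width and returns the pixel box that would go into a label file. The function name layout_text_block, the measure callback and the fixed 10 px character width are assumptions made for illustration; in a real script the widths would come from the font metrics of whatever imaging library draws the text.

```python
from typing import Callable, Dict, List


def layout_text_block(
    text: str,
    x: int,
    y: int,
    max_width: int,
    measure: Callable[[str], int],
    line_height: int,
) -> Dict[str, object]:
    """Greedy word wrap; returns the wrapped lines and the block's pixel box."""
    lines: List[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        # Start a new line when adding the word would exceed max_width.
        # A single word wider than max_width is kept on its own line.
        if current and measure(candidate) > max_width:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    width = max((measure(line) for line in lines), default=0)
    height = line_height * len(lines)
    # Box is (left, top, width, height) in pixels.
    return {"lines": lines, "box": (x, y, width, height)}


def fixed_width(s: str) -> int:
    # Hypothetical metric: every character is 10 px wide.
    return 10 * len(s)


label = layout_text_block(
    "Invoice number 2024-0815",
    x=40, y=120, max_width=150,
    measure=fixed_width, line_height=18,
)
print(label)
```

The returned lines can be drawn one below the other starting at (x, y), and the box can be written to the label file next to the image name.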
I have been looking all around for a solution to my issue, but I can't find a fix yet. Here's my problem:
I have a dynamic PDF which contains a table and several text fields per row that grow vertically as the user adds text (multiple lines and expand to fit vertically). The table properly breaks when the content doesn't fit in the current page, however, I found out that in some scenarios, with a certain amount of characters, the row sometimes overlaps the content in the next page (See below).
By adding more text to the offending line, the content in that row properly breaks to the next page (See below)
I am not sure whether or not this is a known issue with the tool or I am missing some sort of configuration/setting in the template. I haven't found anything online or in the Adobe Documentation. Any thoughts?
I am using:
Adobe Acrobat Pro 9
Adobe LiveCycle Designer ES 8.2
The form version of the PDF runs in Adobe Reader 7.0.5 (For compatibility purposes with one of the tools our clients are using)
Thanks in advance
After a long time looking for a solution, I found a single blog post by someone who had the same issue (which, by the way, Adobe was kind enough not to document). Anyhow, the following two processing instructions need to be added to the XML:
<?layout allowDissonantSplits 1?>
<?layout allowJaggedRowSplits 1?>
My XML looks like this:
<template xmlns="http://www.xfa.org/schema/xfa-template/2.4/">
<?formServer defaultPDFRenderFormat acrobat7.0.5dynamic?>
<?formServer allowRenderCaching 0?>
<?formServer formModel both?>
<?layout allowDissonantSplits 1?>
<?layout allowJaggedRowSplits 1?>
The author specifies that the directives should only be added if this problem occurs. I wonder why these instructions should only be used in this situation. The full blog post can be found here:
http://blogs.adobe.com/dmcmahon/2011/10/10/lc-forms-es-text-overlapping-on-page-break-using-nested-subforms/
Hope this saves someone else some time.
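For anyone who has to patch several templates, here is a rough sketch of adding the two instructions with a script. It works on the template XML as plain text and assumes that the position of the instructions among the other processing instructions inside the template element does not matter. The helper name add_layout_instructions and the sample string are made up for illustration.

```python
import re

LAYOUT_PIS = (
    "<?layout allowDissonantSplits 1?>",
    "<?layout allowJaggedRowSplits 1?>",
)


def add_layout_instructions(xml_text: str) -> str:
    """Insert the layout processing instructions right after the opening <template> tag."""
    match = re.search(r"<template\b[^>]*>", xml_text)
    if match is None:
        raise ValueError("no <template> element found")
    # Only add the instructions that are not already present.
    missing = [pi for pi in LAYOUT_PIS if pi not in xml_text]
    if not missing:
        return xml_text
    insert_at = match.end()
    return xml_text[:insert_at] + "\n" + "\n".join(missing) + xml_text[insert_at:]


# Shortened, hypothetical template used only to show the effect.
sample = (
    '<template xmlns="http://www.xfa.org/schema/xfa-template/2.4/">\n'
    "<?formServer formModel both?>\n"
    "</template>"
)
patched = add_layout_instructions(sample)
print(patched)
```

Running the function a second time leaves the text unchanged, so it is safe to apply to templates that already contain the instructions.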
I have been searching for this in many places. I tried PDFSam, but it does not work for me in this situation. I would like to extract the pages without comments, sticky notes or pencil marks in Acrobat as a separate PDF, to check why these pages were not commented on. I am not a coder, but I have a little JavaScript knowledge, and I have never written JS code for Acrobat. Kindly guide me in the right direction to write this JavaScript code.
Thank you for your help!
An easy way to get around this is to extract the pages you want and then delete all the comments.
This two-step approach solves your problem.
Note that you don't have to delete the comments one by one.
Click the comments button, usually sitting in the lower left corner, which will show all the comments in the left pane. Click any one of them, hit Ctrl+A and then hit the Delete key on your keyboard. Save and you are done.
It saves you the pain of writing JS code.
Hope this helps!
I'm trying to make an iPhone application which can read PDFs in full screen and follow links on PDFs, but I can't find the right way to do it.
First, I tried to use a UIWebView to read the PDF file, but it doesn't work exactly as I wanted (I was not able to fix the link problem).
The second solution was to use the Quartz API to read the PDF. I took a look at http://developer.apple.com/iphone/library/documentation/GraphicsImaging/Conceptual/drawingwithquartz2d/dq_pdf/dq_pdf.html, but I'm only able to draw one page on the screen, and there is no way to jump to the next pages.
Can someone help me? I'm running out of ideas :)
Thanks for your help
Quartz does not know how to change pages, click on links, etc. It only "draws" the PDF page you want - you will have to do all gestures by yourself (or use UIScrollView and such).
BTW, I guess your question is a duplicate. Take a look at: Fast and Lean PDF Viewer for iPhone / iPad / iOs - tips and hints?
I created a simple report and uploaded it to my report server. It looks correct on the report server, but when I set up an email subscription, the report is much narrower than it is supposed to be.
Here is what the report looks like in the designer. It looks similar when I view it on the report server: [http://img58.imageshack.us/img58/4893/designqj3.png]
Here is what the email looks like: [http://img58.imageshack.us/img58/9297/emailmy8.png]
Does anyone know why this is happening?
This issue is fixed in SQL Server 2005 SP3 (it is part of cumulative update package build 3161).
The issue is described here:
http://support.microsoft.com/kb/935399
Basically, the full Outlook 2007 client uses the MS Word HTML rendering engine (which makes the Web Archive report look jacked up).
NOTE: The web Outlook 2007 client uses the IE HTML rendering engine (which makes the Web Archive report look okay).
We have installed the patch on the DB housing Reporting Services and it does fix the issue. Emails look all nice and fancy now.
I notice that the screenshots show Outlook 2007. Perhaps you're not aware that Microsoft somewhat hobbled the HTML capabilities of Outlook in 2007, and now it uses the Word HTML engine, and not the more advanced Internet Explorer one? Might this explain the lacklustre appearance?
http://www.sitepoint.com/blogs/2007/01/10/microsoft-breaks-html-email-rendering-in-outlook/
I got around this problem by doing the following:
Add a Page Header to the report
Add a line to the page header. Set the width of the line to the desired page width.
Set the line colour to white (i.e. to hide the line)
Hope this helps someone else.
Following on from girlC0d3r's solution, images aren't always guaranteed to be shown in an email.
A better solution to widening the report to prevent the content from wrapping is to have a long unbroken string of characters with no whitespace.
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
By giving the text the same color as the background of the email (e.g. white) they'll widen the report and be invisible to the user.
I don't see anything but my first guess is that the fonts are vastly different. The designer has one font and the email is a flat, no-frills kind of thing with a simple font. Without concrete examples, this is just a guess.
I don't think it's a font thing, because the text is being wrapped a lot, and it looks about the same size.
The images show in my preview, but not in the final post. So, here are links to them.
Report in the designer: [http://img58.imageshack.us/img58/4893/designqj3.png]
Email result: [http://img58.imageshack.us/img58/9297/emailmy8.png]
What report output format did you specify for the scheduled job? It seems to me you used HTML, which will autoscale depending on the output browser (HTML adapts).
If having the same layout is important then use PDF as the output format. Then, if the user wants to print the report you know exactly what it will look like and that it will fit nicely on the page.
Can you try a different format? PDF or XLS maybe. In my experience Web Archive looks goofy. Don't know why.
Yeah, I'm using HTML. I would prefer to stick with that, because the users can just read it in their mail clients. PDF or XLS would require them to open an attachment.
I know that the HTML resizes itself to fit the browser, and that's a good thing. The problem I would like to fix is the wasted space - in the email client, the HTML shrinks too much.
girlC0d3r is along the right lines (no pun intended), but the line will likely be shrunk along with the rest of the HTML in the email. A workaround I used yesterday was to create an image 1px high by 600px wide (or whatever), the same color as the background, and bring it into the report as an embedded image. Place it above or below the body of your report. This should force the intended width in the final email. I used this technique successfully in a report yesterday.
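If no image editor is at hand, a spacer image like this can be generated with a few lines of Python. This is only a sketch using the standard library; the function name spacer_png, the default 600 px width and the white colour are assumptions that should be adapted to the report width and the email background.

```python
import struct
import zlib


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def spacer_png(width: int = 600, height: int = 1, rgb=(255, 255, 255)) -> bytes:
    """Return the bytes of a single-colour RGB PNG of the given size."""
    # IHDR: width, height, 8 bits per channel, colour type 2 (RGB),
    # default compression, default filter, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each scanline starts with filter type 0 followed by the raw pixels.
    row = b"\x00" + bytes(rgb) * width
    idat = zlib.compress(row * height)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", idat)
        + _chunk(b"IEND", b"")
    )


png_bytes = spacer_png()
print(len(png_bytes), "bytes")
# To save it (the file name is just an example):
# with open("spacer.png", "wb") as fh:
#     fh.write(png_bytes)
```

The resulting file can then be added to the report as an embedded image as described above.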
I just ran into this issue myself, exactly as portrayed in the OP's screenshots. The reports were beautifully rendered in nearly every format except for Web Archive. My trouble was the use of a rectangle containing each matrix that did not span the width of the report. Upon stretching it out through the remaining white space, the condensing behavior ceased. Hope that helps someone who doesn't have quick access to an SP upgrade!
Where it is not an issue of running on old software that needs a patch...
The reason the columns are different sizes is that the MHTML Device Information Setting 'OutlookCompat' is set to true.
This happens when you create an email subscription in MHTML format and open the report in Outlook. A forum post by Microsoft employee Fanny Liu says:
change the OutlookCompat configuration setting for the MHTML Rendering extension in rsreportserver.config. Set the value to: False.
As I was researching, it appeared that this would impact more than just column size. In my instance it was not that big of a deal, so I decided to leave well enough alone. The report is correct in PDF and on the web, and the email I send includes a link back to the report. If the client wants a pretty report they are going to want it in PDF; the email format is not expected to be printable.
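For those who do want to change the setting, the edit can be sketched in Python as below. The SAMPLE_CONFIG fragment is a simplified assumption of how the MHTML extension entry is laid out (the real entry also carries a Type attribute and sits next to other extensions), so check it against your own rsreportserver.config and keep a backup. Note that ElementTree drops XML comments when it parses, so for the real file a manual edit may be the safer route.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed layout of the relevant part of rsreportserver.config.
SAMPLE_CONFIG = """<Configuration>
  <Extensions>
    <Render>
      <Extension Name="PDF"/>
      <Extension Name="MHTML">
        <Configuration>
          <DeviceInfo>
            <OutlookCompat>True</OutlookCompat>
          </DeviceInfo>
        </Configuration>
      </Extension>
    </Render>
  </Extensions>
</Configuration>"""


def _child(parent: ET.Element, tag: str) -> ET.Element:
    """Return the named child element, creating it when it does not exist."""
    found = parent.find(tag)
    return found if found is not None else ET.SubElement(parent, tag)


def set_outlook_compat(config_xml: str, value: bool) -> str:
    """Set OutlookCompat for the MHTML rendering extension and return the XML."""
    root = ET.fromstring(config_xml)
    for ext in root.iter("Extension"):
        if ext.get("Name") != "MHTML":
            continue
        device_info = _child(_child(ext, "Configuration"), "DeviceInfo")
        _child(device_info, "OutlookCompat").text = "True" if value else "False"
    return ET.tostring(root, encoding="unicode")


updated = set_outlook_compat(SAMPLE_CONFIG, False)
print(updated)
```

Only the extension named MHTML is touched; other rendering extensions keep their settings.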
Encountered the same issue and this worked for me.
Go to --> Properties --> Report
Set InteractiveSize Width to 4.9in
Set Margins to 0 for Left, Right, Top, and Bottom
Set PageSize Width to 4.9in