I am using ABCPDF.net to generate my PDFs from HTML files. There are 3 images on the HTML page which; of which 2 images are generated but the third is one is not on the PDF. I have tried to move the logo around but without any luck. This is my first time using this product which I am impressed with but this one problem is throwing me off.
From memory there could be one or two things causing this.
1) If your image happens to cross over the boundary of the page even by 1 pixel it will not be rendered.
2) You have to make sure that all of your elements are loaded in a procedural fashion by which I mean that whatever you want to be displayed on the top must be added to the page last. If you don't do this you may find that a rectangle you declared to contain some text for example will be displayed over the top of your image which will obscure it.
3) Sounds daft but just double check all syntax and the images location and ensure that you have referenced it correctly.
If all else fails could you provide some source code or a more in depth explanation.
EDIT: Also double check that you have set the correct size on the image when importing it as you may find it reverts back to the actual dimensions that it is on the disk which could make it too big for the page thus not rendering it.
Related
Disclaimer:
I am using iText 5. I know this is generally frowned upon (vs. using iText 7), but I am working with considerable legacy code that uses iText 5, and upgrading does not fall under my control.
Requirements:
A "simple" PDF/A is received as input (text only, these are generated from RTF), as well as a float value corresponding to a desired first page length in inches.
A PDF/A must be output that is identical to the input PDF, except it is paginated as follows: first page length = input value; each subsequent (not first or last) page will fill a standard page length; the last page will be truncated a constant number of points below the content nearest the bottom of the page. Note that input and output width will be identical and constant.
Progress / Approach:
I have extended the SimpleTextExtractionStrategy to generate XML containing font information (size and family, bold or italics, etc.) as well as location information (relative an absolute coordinate system where the origin is at the top left corner of the first page of the input PDF) for each "span" of text extracted from the input PDF.
I then generate a new PDF page by page (where each page is the desired length according to the requirements outlined above), filtering the extracted XML info with LINQ based on the bounds of each new page, and adding appropriately formatted text at the appropriate location using ColumnText.ShowTextAligned(...).
Problem:
The approach outlined above does fine. It generates PDFs with the desired page structure, but some information is lost in translation, namely colored text and underlined text. While colored text shouldn't be seen in these PDFs, underlined text absolutely must be detected.
This set of requirements should also include PDFs with tables. I originally planned on implementing a different module that adheres to the same interface for table PDFs, as these are generated and used separately from the PDFs generated from RTF, and iText has relatively strong table functionality built in.
The two concerns outlined above, coupled with the fact that my described approach was born out of an attempt to reuse existing code leads me to believe that an entirely different approach may be necessary or at least much better. It seems to me that there should be a way to capture content byte info and clip it as necessary to "re-paginate" the input PDF, only worrying about moving content that falls along a page boundary.
Essentially, I am looking for (iText based) recommendations for a better approach. Pseudo-code type answers or simply recommendations for classes / interfaces that may help are acceptable. While it would be nice to handle text and tables together, any advice pertinent to one or the other would also be appreciated. I have perused much of the available documentation on the iText website and other SO questions, but have not found quite what I'm looking for.
Note that no code is included in this question as I am looking for a high-level approach that is entirely different from what I have tried.
Edit:
I didn't notice it before, but the way in which I was reusing fonts (similar to this) resulted in some unexpected (but documented as such) behavior. It seems that I will need to avoid extracting information for re pagination at the text level, as it will be difficult to ensure continuity of fonts between input and output.
I solved this problem a while ago, but figured I would post my solution. I'm sure it's not the most efficient solution, but it works well for my purposes. Note that this will re-paginate a PDF as described in the question containing text only. Table PDF's are handled separately.
The basic process is this:
Use a custom TextExtractionStrategy to extract XML containing information regarding ascent and descent lines for all text in the input PDF, as well as what page it originally appears on.
Given the page length requirements as described in the question (first page = input value, subsequent = standard length, last page = fit content) and the XML info regarding text positions, determine what content will fit on each page of the output PDF. Create a map of where each input page will need to be cropped (top and bottom, note that each input page may be cropped more than once), as well as a map of which cropped pages will need to be "concatenated" together in the final output.
Copy the input PDF page by page to an intermediate temporary PDF (using PdfCopier). If an input page must be cropped more than once (ex: first 2 inches of input page 1 = page 1 output, next 6 inches of input page 1 = page 2 output, final 0.5 inch of input page 1 = top of page 3 output), ensure that it is copied the appropriate number of times (1 time per crop).
Crop each page of the intermediate copied PDF appropriately. This is done by modifying the MediaBox and / or CropBox.
Concatenate the appropriate cropped pages together into the final output PDF's pages. I used a PdfWriter to first create a new page of the appropriate height, then add each appropriate cropped page at the appropriate position in the output PDF page's byte content usingcontentByte.AddTemplate(inputCroppedPage, 0, bottomOfLastAddedCroppedPage).
To anyone who managed to read and understand all of that, congratulations. To anyone else, please let me know what you if you are confused. The solution described above is a little twisted and tough to put into words. While there is too much code to post here (and I am not at liberty to share the code on GitHub or similar), I would be happy to answer any questions that will help someone else implement something similar.
The TextExtractionStrategy mentioned in step 1 was inspired by this answer. Essentially, I used System.Xml.Linq to create an XML document rather than concatentating strings to form HTML, and I ignored any font information, storing only information regarding where text is located in the page (you'll see that this information is available in the linked answer, just isn't written into the final HTML).
The number of pages displayed when viewed in ReportViewer and in exported PDF are differing.
Eg: 50 records are shown in one single page of Report Viewer. But when Exported to PDF 45 reords come in page 1 and the remaining come in page 2.
Soution Tried:
1)Removed Top and Bottom Margins.
2)Reduced "Interactive Page Size" to match the page count.
But it is not consistent, as it is behaving differently with different number of records.
Can anyone tell me how should I proceed to achieve sync between the ReportViewer and exported PDF ?
Thanks
Short answer - you can't do what you are trying to do: the different renderers handle pagination differently, but appropriately for their output.
The HTML renderer is optimised for screen-based reading and generally allows more content per page than the print renderer does as the print renderer is constrained by the paper size that it formats to. Thus the HTML renderer allows more content on fewer pages for a better browser experience whereas the print and PDF renderers have to conform strictly to the page length.
The best illustration of this is the Excel renderer - the Excel renderer renders the entire report onto a single worksheet in most cases (for reports with grouping and page breaks set on the group footer it will render each group on its own worksheet). You wouldn't want the Excel renderer to artificially create worksheets to try to "paginate" your report or to put it all in one worksheet but insert the header into the spreadsheet rows every "page". It does the appropriate thing which is to include all the data in one big worksheet even though that may be logically thought of as one big "page".
The HTML renderer page length is determined (more accurately, influenced) by the InteractiveHeight attribute of the report (in the InteractiveSize property in the Properties pane for the report). However, the interactive height is an approximation rather than a fixed page break setting and your page breaks may still not conform to the print version even if you set InteractiveHeight to the same length as your target page length. This is because the HTML renderer will vary the page length to group the data together better so the interactive page breaks happen around about, but not always exactly, where the interactive height is set.
This is what is happening in your scenario where the report viewer shows 50 records on one page but the PDF has 45 on the first page and 5 on the second page. The report viewer is making the decision that since there are only a few records left to display it will just include them all on the one page rather than force the user to scroll even though the interactive height will be exceeded. Thus you get a better user experience but a variance in pages between renderers. The important thing about the report is the data and the experience with working with that data in that renderer, not that the pages are the same length no matter how you look at it.
See this discussion of rendering behaviour for more information on why what you are trying to achieve isn't achievable. Just educate your users that the browser pagination is optimised for their viewing pleasure.
What I am ultimately trying to do is to create a grid of images for print that are minor variations of the same thing (different text is all). Looking through online resources I was able to create a script that changes the text and exports all of the images necessary (several hundred). What I am trying to do now is to import all of these images into a new photoshop document and lay them all out in a grid and I can't seem to find any examples of this.
Can anyone point me in the right direction to place a file at a specific coordinate (I'm using CS5 and have the design suite so if there is a way in illustrator to do this quickly...)?
Also, I'm open to other ideas on how to do this (even other programs) easily. It's for labels so the positioning on the sheet has to be pretty precise...
The art layer object has a translate() method that takes delta x and y params. You'll need to open each image, copy it to the target document, get its current location (using artLayer.bounds) and do the math to find the deltas to position it where you want it. Your deltas can be in pixels so you'll get plenty of precision.
Check out your 'JavaScript Scripting Reference' pdf in your Adobe install directory for more details.
Ok I'm marking Anna's response as the answer because though I didn't fully test it, it seems like it should work and answers the original question with jsx. However I'm also leaving my final solution in case anyone else runs across this with the same issue and may prefer this method as well.
What I ended up doing instead is using InDesign. I figured out that it has a grid option that lets you import a number of files and place them all in an equal grid in a single command. This is almost exactly what I was looking for, except that it leaves a small border/margin in between the columns and grids and mine were designed to meet exactly.
I couldn't figure out how to make it not have the border (I have very little experience with InDesign, it may be possible). However I was able to select all my images and scale them uniformly to be the correct size, then I just selected each column and dragged it over to snap to the adjacent column and the same with rows...
I'm writing some logic to build a large single PDF file that our users can print at their convenience. I'm using Java's iText library (through Clojure's clj-pdf).
I'm trying to have the PDF show the same exact template form on every single page, however I can't seem to find any documentation or indication that one can have PDF content "fit to a page".
The text in these forms varies a little bit, so there's a chance it might require more of fewer text lines per page. This means that the content has a chance of spilling over to the next page, or being too short, making the next page creep up into the previous one, breaking the requirement of "one form per page" for the rest of the document.
I'm trying to figure out if my option is pretty much only to manually check the length of the text on each page and potentially crop it by hand if I goes over n lines, or if the PDF format somehow supports a smart way of having paragraphs+tables+headings all fit in one page. Some UI systems allow you to control how spill-over is handled, anywhere from cropping to resizing the font, so I'm curious if PDF supports anything of that sort.
Edit: ended up going with pagebreaks for simplicity, wasn't aware of that option when I wrote this question.
If you want to take control over the space taken by text, for instance to fit it on a single page, the way to go would be to create a ColumnText object and to add the content in simulation mode. If the text fits the page, add it for real. If it doesn't, use a smaller font size. This is demonstrated in the MovieAds example where snippets of text are fitted into AcroForm fields.
I have a two-page SSRS report. When I exported it to PDF it was taking 4 pages due to its width, where the 2nd and 4th pages were displaying one of my fields from the table. I tried to set the layout size in report properties as width=18in and height =8.5in.
It gave me the whole table in a single page of PDF, but I am still getting the 2nd and 4th pages blank.
Is the way I am doing it incorrect? How else can I get rid of those blank pages?
In BIDS or SSDT-BI, do the following:
Click on Report > Report Properties > Layout tab (Page Setup tab in SSDT-BI)
Make a note of the values for Page width, Left margin, Right margin
Close and go back to the design surface
In the Properties window, select Body
Click the + symbol to expand the Size node
Make a note of the value for Width
To render in PDF correctly Body Width + Left margin + Right margin must be less than or equal to Page width. When you see blank pages being rendered it is almost always because the body width plus margins is greater than the page width.
Remember: (Body Width + Left margin + Right margin) <= (Page width)
Another thing to try is to set the report property called ConsumeContainerWhitespace to True (the default is false). That's how it got resolved for me.
After hours of struggling with this problem, I stumbled upon a solution that worked for me:
In SSDT (2012), I had originally had my Page Setup/Page units set to Centimeters. When I changed this to Inches, strangely enough, I was able to export my report to PDF without having every other page be blank.
It is better to do this on the design surface (Visual Studio 2012 is shown but can be done in other versions) first before calculating any maths when editing an SSRS document.
Below the following numbers in red circles that map to these following steps:
In the design surface, sometimes the editor will create a page which is larger than the actual controls; hence the ghost area being printed.
Resize to the controls. Visually look at the width/height and see if you can't bring in the page on the design surface to size it to the space actually needed by the controls and no more.
Then try to create a PDF and see if that fixes it.
If #3 does not resolve the issue, then there are controls requiring too much of the actual page size and going over in either length/width. So one will need to make the size of the controls smaller to accommodate a smaller page size.
Also in some circumstances one can just change a property of the report page by setting ConsumeContainerWhitespace to true to automatically consume the spaces.
The problem for me was that SSRS purposely treats your white space as if you intend it be honored:
As well as white space, make sure there is no right margin.
If the pages are blank coming from SSRS, you need to tweak your report layout. This will be far more efficient than running the output through and post process to repair the side effects of a layout problem.
SSRS is very finicky when it comes to pushing the boundaries of the margins. It is easy to accidentally widen/lengthen the report just by adjusting text box or other control on the report. Check the width and height property of the report surface carefully and squeeze them as much as possible. Watch out for large headers and footers.
I have worked with SSRS for over 10 years and the answers above are the go to answers. BUT. If nothing works, and you are completely stuffed....remove items from the report until the problem goes away. Once you have identified which row or report item is causing the problem, put it inside a rectangle container. That's it. Has helped us many many times! Extra pages are mostly caused by report items flowing over the right margin. When all else fails, putting things inside a rectangle or an empty rectangle to the right of an item, can stop this from happening. Good luck out there!
In addition to the margins, the most common issue by far, I have also seen two additional possibilities:
Using + to concatenate text. You should use & instead.
Text overflowing the width of the specified textbox. So if your textbox only holds 30 characters and you try to cram 300 in there, you might end up with extra pages.
Have you tried to see if there is any white space on the right of your report? If so you can drag it back to the end of your report and then drag the report background back to the same spot.
On the properties tab of the report (myReport.rdlc), change the "Keep Together" attribute to False. I've been struggling with this issue for a while and this seems to have solved my issue.
I recently inherited a report that I needed to make a few changes. After following all the recommendations above, it did not help. The report historically had this extra page, and nobody could figure out why.
I right clicked on the tablix and selected properties. There was a checkbox checked that said add a page break after. After removing this, it prints on one page now.
I fixed this issue by doing the following. ( Using the latest version of Report Builder )
Step 1.) Go to View Tab
Step 2.) Check the Properties checkbox
Step 3.) Click inside the body of your report (it will update values in properties tab)
Step 4.) Take not of the width here
Step 5.) Right click in the gray area outside the report and click report properties
Step 6.) Add your left + right margin to your body width ( if that equals 10 then make your width 11)
Step 7.) Save
If your report includes a subreport, the width of the subreport could push the boundaries of the body if subreport and hierarchy are allowed to grow.
I had a similar problem arise with a subreport that could be placed in a cell (spanning 2 columns). It looked like the span could contain it in the designer and it rendered fine in a winform or a browser and, originally, it could generate printer output (or pdf file) without spilling over onto excess pages.
Then, after changing some other column widths (and without exceeding the body width plus margins), the winform and browser renderings looked still looked fine but when the output (printer or pdf) was generated, it grew past the margins and wrote the right side of each page as a 2nd (4th, etc.) page. I could eliminate my problem by increasing colspan where the subreport was placed.
Whether or not you're using subreports, if you have page spillover and your body design fits within the margins of the page, look for something allowed to grow that pushes the width of the body out.
Make sure the designer in visual studio is not going beyond your max width. Hover over the right page border and drag to the left to make sure the page does not go over your desired layout.
I just reduced all elements Width shorter than 8 inch and it is being corrected,
I did that with mouse,
your report Body should be shorter than 8 inch.
I've successfully used pdftk to remove pages I didn't want/need in pdfs. You can download the program here
You might try something like the following. Taken from here under examples
Remove 'page 13' from in1.pdf to create out1.pdf
pdftk in.pdf cat 1-12 14-end output out1.pdf
or:
pdftk A=in1.pdf cat A1-12 A14-end output out1.pdf