How to set background image in PdfPCell in iText?

I am currently using iText to generate PDF reports. I want to use a medium-sized image as the background of a PdfPCell instead of a background color. Is this possible?

You can find an example on how to do this with iText 5.5.1 here. You need to create your own implementation of the PdfPCellEvent interface, for instance:
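import com.itextpdf.text.DocumentException;
import com.itextpdf.text.ExceptionConverter;
import com.itextpdf.text.Image;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.text.pdf.PdfPCellEvent;
import com.itextpdf.text.pdf.PdfPTable;

class ImageBackgroundEvent implements PdfPCellEvent {
    protected Image image;
    public ImageBackgroundEvent(Image image) {
        this.image = image;
    }
    @Override
    public void cellLayout(PdfPCell cell, Rectangle position,
            PdfContentByte[] canvases) {
        try {
            // Draw on the background canvas so the cell content stays on top.
            PdfContentByte cb = canvases[PdfPTable.BACKGROUNDCANVAS];
            // Stretch the image to the cell rectangle and anchor it at the lower-left corner.
            image.scaleAbsolute(position);
            image.setAbsolutePosition(position.getLeft(), position.getBottom());
            cb.addImage(image);
        } catch (DocumentException e) {
            throw new ExceptionConverter(e);
        }
    }
}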
Then you need to create an instance of this event and declare it to the cell that needs this background:
Image image = Image.getInstance(IMG1);
cell.setCellEvent(new ImageBackgroundEvent(image));
This code was tested with the most recent version of iText. Note that you're using a version of iText with my name (Lowagie) in the package names (com.lowagie), so this sample may or may not work for you. We don't know, and we won't test it, because the version you're using was declared EOL years ago and is no longer supported.


How to rotate a specific text in a pdf which is already present in the pdf using itext or pdfbox?

I know we can insert text into pdf with rotation using itext. But I want to rotate the text which is already present in the pdf.
Before.pdf
After.pdf
First of all, in your question you only talk about how to rotate a specific text but in your example you additionally rotate a red rectangle. This answer focuses on rotating text. The process of guessing which graphics might be related to the text and, therefore, probably should be rotated along, is a topic in its own right.
You also mention you are looking for a solution using itext or pdfbox and used the tags itext, pdfbox, and itext7. For this answer I chose iText 7.
You did not explain what kind of text pieces you want to rotate but offered a representative example PDF. In that example I saw that the text to rotate was drawn using a single text showing instruction which is the only such instruction in the encompassing text object in the page content stream. To keep the code in the answer simple, therefore, I can assume the text to rotate is drawn in a consecutive sequence of text showing instructions in a text object in the page content stream framed by instructions that are not text showing ones. This is a generalization of your case.
Furthermore, you did not mention the center of rotation. Based on your example files I assume it to be approximately the start of the base line of the text to rotate.
A Simple Implementation
When editing PDF content streams it is helpful to know the current graphics state at each instruction, e.g. to properly recognize the text drawn by a text showing operation one needs to know the current font to map the character codes to Unicode characters. The text extraction framework in iText already contains code to follow the graphics state. Thus, in this answer a base PdfCanvasEditor class has been developed on top of the text extraction framework.
We can base a solution for the task at hand on that class after a small extension; that class originally sets the text extraction event listener to a dummy implementation but here we'll need a custom one. So we need to add an additional constructor that accepts such a custom event listener as parameter:
public PdfCanvasEditor(IEventListener listener)
{
    super(listener);
}
(Additional PdfCanvasEditor constructor)
Based on this extended PdfCanvasEditor we can implement the task by inspecting the existing page content stream instruction by instruction. For a sequence of consecutive text showing instructions we retrieve the text matrix before and after the sequence, and if the text drawn by the sequence turns out to be the text to rotate, we insert an instruction before that sequence setting the initial text matrix to a rotated version of itself and another one after that sequence setting the text matrix back to what it was there originally.
Our implementation LimitedTextRotater accepts a Matrix representing the desired rotation and a Predicate matching the string to rotate.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

import com.itextpdf.kernel.geom.Matrix;
import com.itextpdf.kernel.pdf.PdfLiteral;
import com.itextpdf.kernel.pdf.PdfNumber;
import com.itextpdf.kernel.pdf.PdfObject;
import com.itextpdf.kernel.pdf.canvas.parser.EventType;
import com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor;
import com.itextpdf.kernel.pdf.canvas.parser.data.IEventData;
import com.itextpdf.kernel.pdf.canvas.parser.data.TextRenderInfo;
import com.itextpdf.kernel.pdf.canvas.parser.listener.IEventListener;

public class LimitedTextRotater extends PdfCanvasEditor {
    public LimitedTextRotater(Matrix rotation, Predicate<String> textMatcher) {
        super(new TextRetrievingListener());
        ((TextRetrievingListener)getEventListener()).limitedTextRotater = this;
        this.rotation = rotation;
        this.textMatcher = textMatcher;
    }

    @Override
    protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List<PdfObject> operands) {
        String operatorString = operator.toString();
        if (TEXT_SHOWING_OPERATORS.contains(operatorString)) {
            // Buffer text showing instructions until the sequence ends.
            recentTextOperations.add(new ArrayList<>(operands));
        } else {
            if (!recentTextOperations.isEmpty()) {
                // The sequence has ended: decide whether to rotate it, then flush it.
                boolean rotate = textMatcher.test(text.toString());
                if (rotate)
                    writeSetTextMatrix(processor, rotation.multiply(initialTextMatrix));
                for (List<PdfObject> recentOperation : recentTextOperations) {
                    super.write(processor, (PdfLiteral) recentOperation.get(recentOperation.size() - 1), recentOperation);
                }
                if (rotate)
                    // Restore the text matrix that would have been in effect after the sequence.
                    writeSetTextMatrix(processor, finalTextMatrix);
                recentTextOperations.clear();
                text.setLength(0);
                initialTextMatrix = null;
            }
            super.write(processor, operator, operands);
        }
    }

    void writeSetTextMatrix(PdfCanvasProcessor processor, Matrix textMatrix) {
        PdfLiteral operator = new PdfLiteral("Tm\n");
        List<PdfObject> operands = new ArrayList<>();
        operands.add(new PdfNumber(textMatrix.get(Matrix.I11)));
        operands.add(new PdfNumber(textMatrix.get(Matrix.I12)));
        operands.add(new PdfNumber(textMatrix.get(Matrix.I21)));
        operands.add(new PdfNumber(textMatrix.get(Matrix.I22)));
        operands.add(new PdfNumber(textMatrix.get(Matrix.I31)));
        operands.add(new PdfNumber(textMatrix.get(Matrix.I32)));
        operands.add(operator);
        super.write(processor, operator, operands);
    }

    void eventOccurred(TextRenderInfo textRenderInfo) {
        Matrix textMatrix = textRenderInfo.getTextMatrix();
        if (initialTextMatrix == null)
            initialTextMatrix = textMatrix;
        // Text matrix after this chunk: the current one advanced by the chunk width.
        finalTextMatrix = new Matrix(textRenderInfo.getUnscaledWidth(), 0).multiply(textMatrix);
        text.append(textRenderInfo.getText());
    }

    static class TextRetrievingListener implements IEventListener {
        @Override
        public void eventOccurred(IEventData data, EventType type) {
            if (data instanceof TextRenderInfo) {
                limitedTextRotater.eventOccurred((TextRenderInfo) data);
            }
        }

        @Override
        public Set<EventType> getSupportedEvents() {
            return null;
        }

        LimitedTextRotater limitedTextRotater;
    }

    final static List<String> TEXT_SHOWING_OPERATORS = Arrays.asList("Tj", "'", "\"", "TJ");

    final Matrix rotation;
    final Predicate<String> textMatcher;
    final List<List<PdfObject>> recentTextOperations = new ArrayList<>();
    final StringBuilder text = new StringBuilder();
    Matrix initialTextMatrix = null;
    Matrix finalTextMatrix = null;
}
(LimitedTextRotater)
You can apply it to a document like this:
try (   PdfReader pdfReader = new PdfReader(...);
        PdfWriter pdfWriter = new PdfWriter(...);
        PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) )
{
    PdfCanvasEditor editor = new LimitedTextRotater(new Matrix(0, -1, 1, 0, 0, 0), text -> true);
    for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++) {
        editor.editPage(pdfDocument, i);
    }
}
(RotateText test testBeforeAkhilNagaSai)
The Predicate used here is text -> true which matches any text. In case of your example PDF that is ok as the text to rotate is the only text. In general you might want a more specific check, e.g. text -> text.equals("The text to be rotated"). In general try not to be too specific, though, as the extracted text might slightly deviate from expectations, e.g. by extra spaces.
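One way to keep such a check tolerant is to normalize whitespace on both sides before comparing. The following is only a minimal sketch: the class TolerantTextMatcher and its methods are made up for illustration and are not part of iText, and they only handle whitespace that Java's \s matches.

```java
import java.util.function.Predicate;

// Hypothetical helper, not part of iText: builds predicates that tolerate
// whitespace differences between the expected and the extracted text.
final class TolerantTextMatcher {
    private TolerantTextMatcher() {
    }

    // Collapses every run of whitespace to a single space and trims both ends.
    static String normalize(String s) {
        return s.replaceAll("\\s+", " ").trim();
    }

    // Matches text that equals the expected string after normalization.
    static Predicate<String> equalsIgnoringSpacing(String expected) {
        String wanted = normalize(expected);
        return text -> text != null && normalize(text).equals(wanted);
    }

    // Matches text that contains the expected string after normalization.
    static Predicate<String> containsIgnoringSpacing(String expected) {
        String wanted = normalize(expected);
        return text -> text != null && normalize(text).contains(wanted);
    }

    public static void main(String[] args) {
        Predicate<String> matcher = equalsIgnoringSpacing("The text to be rotated");
        System.out.println(matcher.test(" The  text to be rotated ")); // true
        System.out.println(matcher.test("Some other text"));           // false
    }
}
```

Such a predicate could then be passed as the second constructor argument, e.g. new LimitedTextRotater(new Matrix(0, -1, 1, 0, 0, 0), TolerantTextMatcher.equalsIgnoringSpacing("The text to be rotated")).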
The result:
As you can see the text is rotated. In contrast to your After.pdf, though, the red rectangle is not rotated. The reason is - as already mentioned at the start - that that rectangle in no way is part of the text.
Some Ideas
First of all, there are ports of the PdfCanvasEditor to iText 5 (the PdfContentStreamEditor in this answer) and PDFBox (the PdfContentStreamEditor in this answer). Thus, if you eventually prefer to switch to either of these PDF libraries, you can create equivalent implementations.
Then, if the assumption that the text to rotate is drawn in a consecutive sequence of text showing instructions in a text object in the page content stream framed by instructions that are not text showing ones does not hold for you, you can generalize the implementation here somewhat. Have a look at the SimpleTextRemover in this answer for inspiration which is based on the PdfContentStreamEditor for iText 5. Here also texts that start somewhere in one text showing instruction and end somewhere in another one are processed which requires some more detailed data keeping and splitting of existing text drawing instructions.
Also, if you want to rotate graphics along with the text that a human viewer might consider associated with it (like the red rectangle in your example file), you can try and extend the example accordingly, e.g. by also extracting the coordinates of the rotated text and in a second run trying to guess which graphics around those coordinates are related and rotating the graphics along. This is not trivial, though.
Finally note that the Matrix provided in the constructor is not limited to rotations, it can represent an arbitrary affine transformation. So instead of rotating text you can also move it or scale it or skew it, ...
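To illustrate the last point, here is a small sketch that computes the six numbers for a few common transformations. It uses the usual PDF convention in which the entries [a b c d e f] map a point (x, y) to (a*x + c*y + e, b*x + d*y + f). The class AffineEntries and its method names are hypothetical helpers, not iText API; the assumption is that the resulting numbers are passed, in this order and cast to float, to the six-argument Matrix constructor used above.

```java
import java.util.Locale;

// Hypothetical helper, not part of iText: computes the entries [a b c d e f]
// of a PDF transformation matrix.
final class AffineEntries {
    private AffineEntries() {
    }

    // Counter-clockwise rotation by the given angle; negative angles rotate clockwise.
    static double[] rotation(double degrees) {
        double r = Math.toRadians(degrees);
        return new double[] { Math.cos(r), Math.sin(r), -Math.sin(r), Math.cos(r), 0, 0 };
    }

    static double[] scaling(double sx, double sy) {
        return new double[] { sx, 0, 0, sy, 0, 0 };
    }

    static double[] translation(double tx, double ty) {
        return new double[] { 1, 0, 0, 1, tx, ty };
    }

    // Skews the x axis by alpha and the y axis by beta (both in degrees).
    static double[] skew(double alphaDegrees, double betaDegrees) {
        return new double[] { 1, Math.tan(Math.toRadians(alphaDegrees)),
                Math.tan(Math.toRadians(betaDegrees)), 1, 0, 0 };
    }

    // Combines two transformations: the result applies 'first', then 'second'.
    static double[] concat(double[] first, double[] second) {
        return new double[] {
            first[0] * second[0] + first[1] * second[2],
            first[0] * second[1] + first[1] * second[3],
            first[2] * second[0] + first[3] * second[2],
            first[2] * second[1] + first[3] * second[3],
            first[4] * second[0] + first[5] * second[2] + second[4],
            first[4] * second[1] + first[5] * second[3] + second[5]
        };
    }

    // Applies the transformation to a point and returns {x', y'}.
    static double[] apply(double[] m, double x, double y) {
        return new double[] { m[0] * x + m[2] * y + m[4], m[1] * x + m[3] * y + m[5] };
    }

    public static void main(String[] args) {
        // A clockwise quarter turn; up to rounding this is 0, -1, 1, 0, 0, 0.
        for (double entry : rotation(-90)) {
            System.out.println(String.format(Locale.ROOT, "%.3f", entry));
        }
    }
}
```

Up to floating-point rounding, rotation(-90) yields the entries 0, -1, 1, 0, 0, 0, which are the values passed to new Matrix(0, -1, 1, 0, 0, 0) in the example above. Note that concat is order-sensitive: scaling followed by translation is not the same as translation followed by scaling.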

itext html to pdf without embedding fonts

I'm following this guide in Chapter 6 of iText 7: Converting HTML to PDF with pdfHTML on adding extra fonts:
public static final String FONT = "src/main/resources/fonts/cardo/Cardo-Regular.ttf";

public void createPdf(String src, String font, String dest) throws IOException {
    ConverterProperties properties = new ConverterProperties();
    FontProvider fontProvider = new DefaultFontProvider(false, false, false);
    FontProgram fontProgram = FontProgramFactory.createFont(font);
    fontProvider.addFont(fontProgram, "Winansi");
    properties.setFontProvider(fontProvider);
    HtmlConverter.convertToPdf(new File(src), new File(dest), properties);
}
While it's working as expected and embedding subsets of the fonts being used, I'm wondering if there is a way for the resulting PDF document to not embed the fonts at all. This is possible when creating BaseFont instances and setting the embedded property to false and using them to build various PDF building blocks. What I'm looking for is this same behavior when using the HtmlConverter.convertToPdf().
What you should normally do is override FontProvider:
FontProvider fontProvider = new DefaultFontProvider(false, false, false) {
    @Override
    public boolean getDefaultEmbeddingFlag() {
        return false;
    }
};
However, the problem is that at the moment this font provider is overwritten by pdfHTML further down the pipeline, in ProcessorContext#reset.
Until this issue is fixed in iText, you can build a custom version of pdfHTML for your needs. The repo is located at https://github.com/itext/i7j-pdfhtml and you are interested in this line. Just replace it with the overridden font provider shown above and build the jar.
Update: The fix is available starting from pdfHTML 2.1.3. From that version on you can use custom font providers freely.

Text Extraction, Not Image Extraction

Please help me understand if my solution is correct.
I'm trying to extract text from a PDF file with a LocationTextExtractionStrategy parser. I'm getting exceptions because the ParseContentMethod tries to parse inline images? The code is simple and looks similar to this:
RenderFilter[] filter = { new RegionTextRenderFilter(cropBox) };
ITextExtractionStrategy strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filter);
PdfTextExtractor.GetTextFromPage(pdfReader, pageNumber, strategy);
I realize the images are in the content stream, but I have a PDF file for which text extraction fails because of inline images. It first throws an UnsupportedPdfException of "The filter /DCTDECODE is not supported" and then finally fails with an InlineImageParseException of "Could not find image data or EI", when all I really care about is the text. The BI/EI exists in my file, so I assume this failure is caused by the /DCTDECODE exception. But again, I don't care about images, I'm looking for text.
My current solution for this is to add a filterHandler in the InlineImageUtils class that assigns the Filter_DoNothing() filter to the DCTDECODE filterHandler dictionary. This way I don't get exceptions when I have InlineImages with DCTDECODE. Like this:
private static bool InlineImageStreamBytesAreComplete(byte[] samples, PdfDictionary imageDictionary) {
    try {
        IDictionary<PdfName, FilterHandlers.IFilterHandler> handlers = new Dictionary<PdfName, FilterHandlers.IFilterHandler>(FilterHandlers.GetDefaultFilterHandlers());
        handlers[PdfName.DCTDECODE] = new Filter_DoNothing();
        PdfReader.DecodeBytes(samples, imageDictionary, handlers);
        return true;
    } catch (IOException e) {
        return false;
    }
}

public class Filter_DoNothing : FilterHandlers.IFilterHandler
{
    public byte[] Decode(byte[] b, PdfName filterName, PdfObject decodeParams, PdfDictionary streamDictionary)
    {
        return b;
    }
}
My problem with this "fix" is that I had to change the iTextSharp library. I'd rather not do that so I can try to stay compatible with future versions.
Here's the PDF in question:
https://app.box.com/s/7eaewzu4mnby9ogpl2frzjswgqxn9rz5

How a font is detected to be bold/italic/plain that is used in PDF

While extracting content from a PDF using the MuPDF library, I only get the font name, not its font face.
Do I have to guess (e.g. from "bold" in the font name, though that is not the right way), or is there another way to detect whether a specific font is bold, italic, or plain?
I have used iTextSharp to extract the font family, font color, etc.:
public void Extract_inputpdf() {
    text_input_File = string.Empty;
    StringBuilder sb_inputpdf = new StringBuilder();
    PdfReader reader_inputPdf = new PdfReader(path); // read PDF
    // iTextSharp page numbers start at 1, not 0.
    for (int i = 1; i <= reader_inputPdf.NumberOfPages; i++) {
        TextWithFont_inputPdf inputpdf = new TextWithFont_inputPdf();
        text_input_File = iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(reader_inputPdf, i, inputpdf);
        sb_inputpdf.Append(text_input_File);
        input_pdf = sb_inputpdf.ToString();
    }
    reader_inputPdf.Close();
    clear();
}

public class TextWithFont_inputPdf : iTextSharp.text.pdf.parser.ITextExtractionStrategy {
    private readonly StringBuilder result = new StringBuilder();

    public void RenderText(iTextSharp.text.pdf.parser.TextRenderInfo renderInfo) {
        string curFont = renderInfo.GetFont().PostscriptFontName;
        // Split the PostScript font name here if you want its parts separately.
        result.Append(renderInfo.GetText());
    }

    public string GetResultantText() {
        return result.ToString();
    }

    // Remaining interface members; nothing to do for this use case.
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderImage(iTextSharp.text.pdf.parser.ImageRenderInfo renderInfo) { }
}
The PDF spec contains entries which allow you to specify the style of a font. However unfortunately in the real world you will often find that these are absent.
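The entries in question live in the font descriptor dictionary: /Flags (where bit 7 marks an italic font and bit 19 is the ForceBold hint), the optional /FontWeight (400 is normal, 700 is bold), and /ItalicAngle. As a rough sketch, assuming you have already read these values with whatever library you use and pass null for absent entries, they could be interpreted like this. The class FontDescriptorStyle is a hypothetical helper, and treating ForceBold as "bold" is a simplification, since that flag is really a rendering hint.

```java
// Hypothetical helper: interprets values that were already read from a PDF
// font descriptor dictionary (/Flags, /FontWeight, /ItalicAngle).
// Absent entries are passed as null.
final class FontDescriptorStyle {
    // Bit positions from the font flags table of the PDF specification
    // (bit 1 is the lowest-order bit).
    static final int FLAG_ITALIC = 1 << 6;      // bit 7
    static final int FLAG_FORCE_BOLD = 1 << 18; // bit 19

    private FontDescriptorStyle() {
    }

    // Italic if the Italic flag is set or the italic angle is non-zero.
    static boolean looksItalic(Integer flags, Double italicAngle) {
        boolean byFlag = flags != null && (flags & FLAG_ITALIC) != 0;
        boolean byAngle = italicAngle != null && italicAngle != 0.0;
        return byFlag || byAngle;
    }

    // Bold if the weight is at least 700; ForceBold is only a weak indication.
    static boolean looksBold(Integer flags, Integer fontWeight) {
        boolean byWeight = fontWeight != null && fontWeight >= 700;
        boolean byFlag = flags != null && (flags & FLAG_FORCE_BOLD) != 0;
        return byWeight || byFlag;
    }

    public static void main(String[] args) {
        // 96 = Nonsymbolic (32) + Italic (64)
        System.out.println(looksItalic(96, null));
        System.out.println(looksBold(96, 700));
    }
}
```

If all three entries are missing, both methods return false, which only means "unknown", not "plain".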
If the font is referenced rather than embedded, this generally means you are stuck with the PostScript name for the font. It requires some heuristics, but normally the name provides sufficient clues as to the style. It sounds like this is pretty much where you are.
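A name-based heuristic of this kind might look like the sketch below. It is written in Java, but the logic ports directly to C#. The class FontNameStyleGuesser and the keyword list are assumptions made for illustration, and real-world font names will produce both false positives and misses. The only format fact relied on is that subset fonts carry a tag of six uppercase letters followed by a plus sign in front of the name.

```java
import java.util.Locale;

// Hypothetical helper: guesses a font style from its PostScript name.
final class FontNameStyleGuesser {
    enum Style { PLAIN, BOLD, ITALIC, BOLD_ITALIC }

    private FontNameStyleGuesser() {
    }

    // Removes a subset tag such as "ABCDEF+" from the front of the name.
    static String stripSubsetPrefix(String name) {
        return name.matches("^[A-Z]{6}\\+.*") ? name.substring(7) : name;
    }

    static Style guess(String postScriptName) {
        String name = stripSubsetPrefix(postScriptName).toLowerCase(Locale.ROOT);
        // The style usually follows the last '-' or ',' (e.g. "Arial,Bold").
        int cut = Math.max(name.lastIndexOf('-'), name.lastIndexOf(','));
        String stylePart = cut >= 0 ? name.substring(cut + 1) : "";

        boolean bold = name.contains("bold")
                || stylePart.contains("black") || stylePart.contains("heavy");
        boolean italic = name.contains("italic") || name.contains("oblique")
                || stylePart.endsWith("it"); // e.g. "MinionPro-It"

        if (bold && italic) {
            return Style.BOLD_ITALIC;
        }
        if (bold) {
            return Style.BOLD;
        }
        return italic ? Style.ITALIC : Style.PLAIN;
    }

    public static void main(String[] args) {
        System.out.println(guess("ABCDEF+TimesNewRomanPS-BoldItalicMT")); // BOLD_ITALIC
        System.out.println(guess("Helvetica"));                           // PLAIN
    }
}
```

Note that this deliberately treats names such as "SemiBold" as bold; tighten or extend the keyword checks to suit the fonts you actually encounter.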
If the font is embedded you can parse it and try and find style information from the embedded font program. If it is subsetted then in theory this information might be removed but in general I don't think it will be. However parsing TrueType/OpenType fonts is boring and you may not feel that it is worth it.
I work on the ABCpdf .NET software component so my replies may feature concepts based around ABCpdf. It's just what I know. :-)

How to give the option to change the TextBlock foreground color and size in Windows Phone 7

I am completely new to Windows Phone 7. I have developed a sample application in which I want to give the user the option to change the font color, size, and style (italic/bold) dynamically (like RadEditor). Please help me implement this.
If you develop your application in MVVM style, this is not hard to do. You just need a property for every setting you want to change dynamically, and then bind to these properties. You then create a settings view where the properties can be set. When they change, you use INotifyPropertyChanged to broadcast that a property value has changed, so every control bound to that property will update and redraw.
GalaSoft MVVM
MVVM Codeplex
Easy MVVM sample Application for Windows Phone 7
The link you found for saving an image looks OK, but I did it a bit differently. You actually already get a WriteableBitmap image from the CameraCaptureTask, and you can save it like this.
To save the Image:
private void SaveToIsolatedStorage(WriteableBitmap image, string fileName)
{
    using (IsolatedStorageFile myIsolatedStorage = IsolatedStorageFile.GetUserStoreForApplication())
    {
        if (myIsolatedStorage.FileExists(fileName))
        {
            myIsolatedStorage.DeleteFile(fileName);
        }
        using (var stream = myIsolatedStorage.OpenFile(fileName, FileMode.Create))
        {
            Extensions.SaveJpeg(image, stream, image.PixelWidth, image.PixelHeight, 0, 100);
        }
    }
}
To read the Image:
private WriteableBitmap ReadFromIsolatedStorage(string fileName)
{
    WriteableBitmap bitmap = new WriteableBitmap(200, 200);
    using (IsolatedStorageFile myIsolatedStorage = IsolatedStorageFile.GetUserStoreForApplication())
    {
        // Use the store opened above; the original referred to an undeclared "store" variable.
        if (myIsolatedStorage.FileExists(fileName))
        {
            using (var stream = myIsolatedStorage.OpenFile(fileName, FileMode.Open))
            {
                bitmap.SetSource(stream);
            }
        }
    }
    return bitmap;
}
I hope this works, because I wrote it from scratch. :)
In your ViewModel you should have a WriteableBitmap property which is bound to the Image control in your view.
If you plan to work with a lot of images, you should read a bit more about this: Silverlight images use a lot of memory, so you will need to address that at some point.