Text does not display when trying to replace a string in a PDF using PDFBox - pdfbox

I tried to replace a string with "XXX" in an existing PDF using Apache PDFBox. The replacement did not happen when I used ReplaceString.java from org.apache.pdfbox.examples.pdmodel directly,
so we modified the following part.
.
.
.
else if( op.getOperation().equals( "TJ" ) )
{
COSArray previous = (COSArray)tokens.get( j-1 );
for( int k=0; k<previous.size(); k++ )
{
Object arrElement = previous.getObject( k );
if( arrElement instanceof COSString )
{
COSString cosString = (COSString)arrElement;
String string = cosString.getString();
string = string.replaceFirst( strToFind, message );
cosString.reset();
cosString.append( string.getBytes() );
}
}
}
.
.
.
Modified as:
else if( op.getOperation().equals( "TJ" ) )
{
COSArray previous = (COSArray)tokens.get( j-1 );
for( int k=0; k<previous.size(); k++ )
{
Object arrElement = previous.getObject( k );
if( arrElement instanceof COSString )
{
COSString cosString = (COSString)arrElement;
byte[] cosBytes = cosString.getBytes();
String s = new String(cosBytes, "Cp1047");
if(s.contains(strToFind))
{
String newString = s.replaceAll(strToFind, message);
cosString.reset();
cosString.append( newString.getBytes("Cp1047") );
}
}
}
}
.
.
This replaced the string with another one. But at some positions in the PDF, the replacement is displayed when it consists of digits, yet shows up blank when it consists of letters.
Please help.
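Separately from the encoding question, note that String.replaceFirst and String.replaceAll, which both versions above use, treat the search term as a regular expression and give `$` and `\` a special meaning in the replacement. The following is a minimal sketch of a literal replacement, assuming the search term is meant as plain text. The class and method names are illustrative and not part of PDFBox, and the sketch does not address the blank glyphs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplace {
    // Replaces the first literal occurrence of 'find' in 'text'.
    // Pattern.quote and Matcher.quoteReplacement switch off regex semantics.
    static String replaceFirstLiteral(String text, String find, String replacement) {
        return text.replaceFirst(Pattern.quote(find), Matcher.quoteReplacement(replacement));
    }

    // Replaces every literal occurrence; String.replace never interprets regex.
    static String replaceAllLiteral(String text, String find, String replacement) {
        return text.replace(find, replacement);
    }

    public static void main(String[] args) {
        // Parentheses and '$' would be special in a regex or a regex replacement.
        System.out.println(replaceFirstLiteral("Total (USD): 5", "(USD)", "$XXX"));
    }
}
```

With the unquoted calls, a search term such as "(USD)" would be read as a capture group and would not match the parentheses themselves.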

Related

Update PDF using pdfBox

I would like to ask whether it is possible, using the PDFBox libraries, to update an existing PDF at a specific point.
I am trying to use a solution that is already available online, but it seems the getTokens() method does not split the words properly, so I cannot find the part I would like to modify.
This is the code (Groovy):
for( int i = 0; i < dataContext.getDataCount(); i++ ) {
InputStream is = dataContext.getStream(i);
Properties props = dataContext.getProperties(i);
String searchString= "Hours worked";
String replacement = "Hours worked: 2";
File file = new File("\\\\****\\UKDC\\GFS\\PRE\\PREPROD\\Alchemer\\Template\\***.pdf");
PDDocument doc = PDDocument.load(file);
for ( PDPage page : doc.getPages() )
{
PDFStreamParser parser = new PDFStreamParser(page);
parser.parse();
List tokens = parser.getTokens();
logger.info("in Page");
for (int j = 0; j < tokens.size(); j++)
{
logger.info("tokens:"+tokens[j]);
Object next = tokens.get(j);
//logger.info("in Object");
if (next instanceof Operator)
{
Operator op = (Operator) next;
String pstring = "";
int prej = 0;
//Tj and TJ are the two operators that display strings in a PDF
if (op.getName().equals("Tj"))
{
logger.info("in Tj");
// Tj takes one operand, the string to display, so let's update that operand
COSString previous = (COSString) tokens.get(j - 1);
String string = previous.getString();
logger.info("previousString:"+string);
string = string.replaceFirst(searchString, replacement);
previous.setValue(string.getBytes());
} else
if (op.getName().equals("TJ"))
{
logger.info("in TJ:"+ op.getName());
COSArray previous = (COSArray) tokens.get(j - 1);
logger.info("previous:"+previous);
for (int k = 0; k < previous.size(); k++)
{
Object arrElement = previous.getObject(k);
if (arrElement instanceof COSString)
{
COSString cosString = (COSString) arrElement;
String string = cosString.getString();
logger.info("string:"+string);
if (j == prej || string.equals(" ") || string.equals(":") || string.equals("-")) {
pstring += string;
} else {
prej = j;
pstring = string;
}
}
}
logger.info("pstring:"+pstring);
if (searchString.equals(pstring.trim()))
{
logger.info("in searchString");
COSString cosString2 = (COSString) previous.getObject(0);
cosString2.setValue(replacement.getBytes());
int total = previous.size()-1;
for (int k = total; k > 0; k--) {
previous.remove(k);
}
}
}
}
}
logger.info("in updatedStream");
// now that the tokens are updated we will replace the page content stream.
PDStream updatedStream = new PDStream(doc);
OutputStream out = updatedStream.createOutputStream(COSName.FLATE_DECODE);
ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
tokenWriter.writeTokens(tokens);
logger.info("in tokenWriter");
out.close();
page.setContents(updatedStream);
doc.save("\\\\***\\UKDC\\GFS\\PRE\\PREPROD\\Alchemer\\***1.pdf");
}
When I execute the code, I am trying to find the string "Hours worked" and replace it with
"Hours worked: 2".
There are 2 problems:
1. When I execute the code and check the logs, I can see that the tokens are not created properly.
Two different COSArrays are created, even though the text is all on one line in the PDF,
and this can be a problem if I have to search for a specific word.
2. When it finds the word, the replacement seems to work, but a strange character is rendered.
So here are my 2 questions:
How can I control the token behaviour (or maybe the parser's) so that an entire phrase ends up in the same token until a special character occurs?
How do I format the new character in the new PDF?
I hope you can help me. Thanks for your support.
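The operand of a TJ operator is an array that mixes string fragments with numeric position adjustments, so a phrase is often spread over several elements. The following is a minimal sketch of the join-then-replace idea that the Groovy code above attempts. It models the array as a plain List<Object> rather than a PDFBox COSArray, and all names are hypothetical. It does not cover a phrase that is split across two separate TJ operators, and it says nothing about whether the font can display the new characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TjArrayReplace {
    // Joins the string elements of a modelled TJ array; if the joined text
    // contains 'search', returns a one-element array holding the replaced text.
    // Otherwise the original array is returned untouched.
    static List<Object> replaceInArray(List<Object> tjArray, String search, String replacement) {
        StringBuilder joined = new StringBuilder();
        for (Object element : tjArray) {
            if (element instanceof String) {
                joined.append((String) element); // numbers are adjustments, skip them
            }
        }
        String text = joined.toString();
        if (!text.contains(search)) {
            return tjArray;
        }
        List<Object> result = new ArrayList<>();
        result.add(text.replace(search, replacement));
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical operand: three fragments with two position adjustments.
        List<Object> tjArray = Arrays.<Object>asList("Hou", -12, "rs wor", 3.5, "ked");
        System.out.println(replaceInArray(tjArray, "Hours worked", "Hours worked: 2"));
    }
}
```

Collapsing the array drops the position adjustments, so the spacing of the replaced text can differ slightly from the original.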

How to make one report parameter's selection change the selectable values of another report parameter while allowing multiselect for both?

Let's say I have a table with a list of names, such as "a, b, c", and each name has several other values assigned (some of the other values can be assigned to several or all of the names, see the example below).
Table example:
( names - other ):
a - aa
a - ab
a - ac
b - ab
b - bb
b - cb
c - ac
c - bc
c - cc
How do I set this up in BIRT so that I can select names in one parameter box and get the corresponding other values to select from in another parameter? I know this is possible with a cascading parameter, but that doesn't allow multiselect for the first parameter, which in our example would be the names values.
I found a partial solution here: https://forums.opentext.com/forums/developer/discussion/61283/allow-multi-value-for-cascading-parameter
and here (thanks, Google Translate ^^): https://www.developpez.net/forums/d1402270/logiciels/solutions-d-entreprise/business-intelligence/birt/birt-4-3-1-multi-value-cascade-parameters/
Steps:
Change the contents of the file "\birt\webcontent\birt\ajax\ui\dialog\BirtParameterDialog.js" (one of the links offers a replacement file for download, but in case it goes dead I'm sharing the 2 snippets from it where the changes occur):
__refresh_cascade_select : function( element )
{
var matrix = new Array( );
var m = 0;
for( var i = 0; i < this.__cascadingParameter.length; i++ )
{
for( var j = 0; j < this.__cascadingParameter[i].length; j++ )
{
var paramName = this.__cascadingParameter[i][j].name;
if( paramName == this.__isnull )
paramName = this.__cascadingParameter[i][j].value;
if( paramName == element.id.substr( 0, element.id.length - 10 ) )
{
//CHANGE//
//Both ways work (String(selectedValue) or Array(selectedValueArray))
var l = 0;
var isFirstElement = true;
var selectedValue = "";
var selectedValueArray = new Array();
for(var test = 0; test<element.options.length;test++){
if(element.options[test].selected){
if(isFirstElement){
isFirstElement = false;
selectedValue = element.options[test].value;
}
else{
selectedValue = selectedValue+","+element.options[test].value;
}
selectedValueArray[l] = element.options[test].value;
l++;
}
}
//CHANGE//
var tempText = element.options[element.selectedIndex].text;
var tempValue = element.options[element.selectedIndex].value;
// Null Value Parameter
if ( tempValue == Constants.nullValue )
{
this.__cascadingParameter[i][j].name = this.__isnull;
this.__cascadingParameter[i][j].value = paramName;
}
else if( tempValue == '' )
{
if( tempText == "" )
{
var target = element;
target = target.parentNode;
var oInputs = target.getElementsByTagName( "input" );
if( oInputs.length >0 && oInputs[1].value != Constants.TYPE_STRING )
{
// Only String parameter allows blank value
alert( birtUtility.formatMessage( Constants.error.parameterNotAllowBlank, paramName ) );
this.__clearSubCascadingParameter( this.__cascadingParameter[i], j );
return;
}
else
{
// Blank Value
this.__cascadingParameter[i][j].name = paramName;
this.__cascadingParameter[i][j].value = tempValue;
}
}
else
{
// Blank Value
this.__cascadingParameter[i][j].name = paramName;
this.__cascadingParameter[i][j].value = tempValue;
}
}
else
{
this.__cascadingParameter[i][j].name = paramName;
//CHANGE//
//Both ways work (String(selectedValue) or Array(selectedValueArray))
this.__cascadingParameter[i][j].value = selectedValueArray;
//CHANGE//
}
for( var m = 0; m <= j; m++ )
{
if( !matrix[m] )
{
matrix[m] = {};
}
matrix[m].name = this.__cascadingParameter[i][m].name;
matrix[m].value = this.__cascadingParameter[i][m].value;
}
this.__pendingCascadingCalls++;
birtEventDispatcher.broadcastEvent( birtEvent.__E_CASCADING_PARAMETER, matrix );
}
}
}
},
and this:
// exist select control and input text/password
// compare the parent div offsetTop
if( oFirstITC.parentNode && oFirstST.parentNode )
{
// Bugzilla 265615: need to use cumulative offset for special cases
// where one element is inside a group container
var offsetITC = Position.cumulativeOffset( oFirstITC );
var offsetST = Position.cumulativeOffset( oFirstST );
// compare y-offset first, then x-offset to determine the visual order
if( ( offsetITC[1] > offsetST[1] ) || ( offsetITC[1] == offsetST[1] && offsetITC[0] > offsetST[0] ) )
{
oFirstST.focus( );
}
else
{
oFirstITC.focus( );
}
}
After the .js file is changed, cascading parameters can have multiselect on all levels.
Example:
1st DataSet "DS_country" query:
select CLASSICMODELS.OFFICES.COUNTRY
from CLASSICMODELS.OFFICES
2nd DataSet "DS_office" query:
select CLASSICMODELS.OFFICES.OFFICECODE
from CLASSICMODELS.OFFICES
where CLASSICMODELS.OFFICES.COUNTRY IN ('DS_country')
After the datasets are created, we can make the cascading report parameters CRP_country and CRP_office. Keep in mind that the UI doesn't let you choose "Allow multiple values" on the upper levels, but we can change that in the property editor after the parameters are made: go to the Advanced tab and change the "Scalar parameter type" property value to "Multi Value".
The only thing left is to select the dataset behind the lower-level cascading parameter (CRP_office in our example), go to the Script tab and add this to "beforeOpen":
// Do this if your selections are of String type.
var stringArray = params["CRP_country"].value.toString().split(",");
var result = "";
for(var i =0 ; i < stringArray.length ; i++){
if(i==0){
result = "'"+stringArray[i]+"'";
}
else{
result = result+",'"+stringArray[i]+"'";
}
}
// For String
this.queryText = this.queryText.replace("'DS_country'", result);
//For integer (the first part is useless)
//this.queryText = this.queryText.replace("'DS_country'", params["CRP_country"].value);
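The beforeOpen script above turns the comma-separated selection into a quoted list for the IN clause. The same string handling is sketched below in Java, purely for illustration. The name toQuotedList is hypothetical, and the doubling of embedded single quotes is an addition that the original script does not perform. Like the script, the sketch splits on commas and therefore assumes the selected values contain none.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class InListBuilder {
    // Turns "USA,France" into "'USA','France'".
    // Embedded single quotes are doubled (assumption: standard SQL escaping).
    static String toQuotedList(String commaSeparated) {
        return Arrays.stream(commaSeparated.split(","))
                .map(value -> "'" + value.replace("'", "''") + "'")
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String query = "select OFFICECODE from OFFICES where COUNTRY IN ('DS_country')";
        // Same placeholder substitution as in the beforeOpen script.
        System.out.println(query.replace("'DS_country'", toQuotedList("USA,France")));
    }
}
```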

parser.getTokens() gives out junk data and single characters PDFBox-1.8.9 version

I am new to PDFBox. I am using the pdfbox-app-2.0.0-RC1 version to fetch the entire text from a PDF using PDFTextStripperByArea. Is it possible to get each string separately?
For example,
in the following text,
Nomination : Name&Address
Shipper : shipper name
I need "Nomination" as a separate string and "Name&Address" as a separate string. Instead, I am getting each character separately. I have tried different PDFs. For most PDFs I get the exact string, but for a few I don't.
I am using the following code to get the separate strings.
for (PDPage page : doc.getPages()) {
PDFStreamParser parser = new PDFStreamParser(page);
parser.parse();
List<Object> tokens = parser.getTokens();
for (int j = 0; j < tokens.size(); j++) {
Object next = tokens.get(j);
if (next instanceof Operator) {
Operator op = (Operator) next;
if (op.getName().equals("Tj")) {
COSString previous = (COSString) tokens.get(j - 1);
String string = previous.getString();
System.out.println("string1===" + string);
if (string.contains("Plant")) {
int size = al.size();
al.add(string);
stop = false;
continue;
}
if (!string.contains("_") && !stop) {
if (string.contains("Nomination")) {
stop = true;
} else {
al.add(string);
}
}
} else if (op.getName().equals("TJ")) {
COSArray previous = (COSArray) tokens.get(j - 1);
for (int k = 0; k < previous.size(); k++) {
Object arrElement = previous.getObject(k);
if (arrElement instanceof COSString) {
COSString cosString = (COSString)arrElement;
String string = cosString.getString();
System.out.println("string2====>>"+string);
al.add(string);
}
}
}
}
}
}
I am getting the following output:
string2====>>Nom
string2====>>i
string2====>>na
string2====>>t
string2====>>i
string2====>>on
string1===
string2====>>(
string2====>>T
string2====>>o
string1===
string2====>>Loa
string2====>>di
string2====>>ng
string1===
string2====>>Fa
string2====>>c
string2====>>i
string2====>>l
string2====>>i
string2====>>t
string2====>>y
string2====>>)
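The output above has one line per array element, because the loop adds every fragment of a TJ array on its own. The following is a minimal sketch of collecting one string per text-showing operator instead. It models the token list with plain Java objects rather than PDFBox classes, and Op and collectRuns are hypothetical names. Words that the PDF splits across several operators would still come out as separate strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TextRunCollector {
    // Stand-in for a content stream operator token such as Tj or TJ.
    static final class Op {
        final String name;
        Op(String name) { this.name = name; }
    }

    // Walks a flat token list (each operand precedes its operator) and
    // returns one string per Tj/TJ operator instead of one per fragment.
    static List<String> collectRuns(List<Object> tokens) {
        List<String> runs = new ArrayList<>();
        for (int j = 1; j < tokens.size(); j++) {
            if (!(tokens.get(j) instanceof Op)) {
                continue;
            }
            String name = ((Op) tokens.get(j)).name;
            Object operand = tokens.get(j - 1);
            if (name.equals("Tj") && operand instanceof String) {
                runs.add((String) operand);
            } else if (name.equals("TJ") && operand instanceof List) {
                StringBuilder sb = new StringBuilder();
                for (Object element : (List<?>) operand) {
                    if (element instanceof String) {
                        sb.append((String) element); // skip numeric adjustments
                    }
                }
                runs.add(sb.toString());
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Hypothetical tokens shaped like the fragments in the output above.
        List<Object> tokens = Arrays.<Object>asList(
                Arrays.<Object>asList("Nom", 2, "i", -1, "na", "t", "i", "on"), new Op("TJ"),
                Arrays.<Object>asList("(", "T", "o"), new Op("TJ"));
        System.out.println(collectRuns(tokens));
    }
}
```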

C++/CLI - spinner text

I have text like
I {love|enjoy|like{| reading| posting on}} {stack|stackover}. {:D |:)
|^^ |}{They{| are {forums|boards} that}|The {boards|forums|bbs}}
provide {free|awesome|infomrative|new} {guides|ideas|tutorials|posts}
(and {tools|downloads|books}!)
in String^ Szablon
and I want to generate text from it.
I wrote this code:
while ((str = din->ReadLine()) != nullptr) //read from the file
{
int rndInt;
tekst=Szablon;
int i=0;
while (i<5){
Regex^ spin = gcnew Regex("{[^{}]*}");
String^ delimStr = "|";
Match^ m = spin->Match(tekst);
if ( m->Success )
{
array<Char>^ delimiter = delimStr->ToCharArray( );
array<String^>^ words;
words = m->Value->Split( delimiter );
words[0] = words[0]->Replace("{","");
words[words->Length-1] = words[words->Length-1]->Replace("}","");
Random^ r = gcnew Random();
int x=r->Next(0,words->Length);
String^ aktualny=words[x];
tekst = tekst->Replace(m->Value,aktualny);
}
else {i=8;}
}
MessageBox::Show(tekst);}
and it works, but it generates the same result every time.
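A likely cause, though the post does not confirm it, is that a new Random is created inside the loop. In the .NET Framework the parameterless constructor seeds from the system clock, so generators created in quick succession can yield identical sequences. The same regex approach is sketched below in Java with a single generator that is created once and passed in. The names are illustrative. Unlike the Replace call above, the sketch substitutes only the matched occurrence, so identical groups are resolved independently.

```java
import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Spinner {
    // Innermost group: a brace pair that contains no other braces.
    private static final Pattern GROUP = Pattern.compile("\\{[^{}]*\\}");

    // Resolves groups from the inside out, drawing every choice from 'rng'.
    static String spin(String template, Random rng) {
        String text = template;
        Matcher m = GROUP.matcher(text);
        while (m.find()) {
            String body = m.group().substring(1, m.group().length() - 1);
            String[] options = body.split("\\|", -1); // -1 keeps empty options
            String choice = options[rng.nextInt(options.length)];
            text = text.substring(0, m.start()) + choice + text.substring(m.end());
            m = GROUP.matcher(text); // rescan, outer groups may now be innermost
        }
        return text;
    }

    public static void main(String[] args) {
        Random rng = new Random(); // created once, outside any loop
        String template = "I {love|enjoy|like{| reading| posting on}} {stack|stackover}.";
        for (int i = 0; i < 3; i++) {
            System.out.println(spin(template, rng));
        }
    }
}
```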

Java: edit existing PDF text with PDFBox

I am trying to edit text with the PDFBox library and don't know how. I do not know how to get the stream of individual text objects so that I could edit the text and/or its color.
Any ideas or examples?
Thanks
I found that editing text in a PDF is not reliable, so try clearing the text with a rectangle (filled with white or the background color) and writing the new text in the cleared position. Here is some sample code.
//to add a link in footer
//to replace a text
//to replace a link/url/href
public static void editTextorUrl(String inputFile, String outputFile)
throws IOException, COSVisitorException {
// the document
PDDocument doc = null;
try {
System.out.println(inputFile);
doc = PDDocument.load(inputFile);
List pages = doc.getDocumentCatalog().getAllPages();
for (int i = 0; i < pages.size(); i++) {
float inch = 72;
PDGamma colourRed = new PDGamma();
colourRed.setR(1);
PDGamma colourBlue = new PDGamma();
colourBlue.setB(1);
PDGamma white = new PDGamma();
white.setR(1);
white.setB(1);
white.setG(1);
PDBorderStyleDictionary borderThick = new PDBorderStyleDictionary();
borderThick.setWidth(inch / 12); // 12th inch
PDBorderStyleDictionary borderThin = new PDBorderStyleDictionary();
borderThin.setWidth(inch / 72); // 1 point
PDBorderStyleDictionary borderULine = new PDBorderStyleDictionary();
borderULine.setStyle(PDBorderStyleDictionary.STYLE_UNDERLINE);
borderULine.setWidth(inch / 72); // 1 point
PDPage page = (PDPage) pages.get(i);
PDFont font = PDType1Font.HELVETICA;
PDPageContentStream contentStream = new PDPageContentStream(
doc, page, true, false);
contentStream.setNonStrokingColor(Color.WHITE);
contentStream.fillRect(55, 27, 144, 17);
contentStream.setNonStrokingColor(Color.BLUE);
contentStream.beginText();
contentStream.setFont(font, 11);
contentStream.moveTextPositionByAmount(55, 37);
contentStream.drawString("www.loasoftwares.com"); //text to be replaced
contentStream.endText();
contentStream.setLineWidth(inch / 300);
contentStream.setStrokingColor(Color.BLUE);
contentStream.drawLine(55, 34, 188, 34);
contentStream.close();
PDAnnotationLink txtLink = new PDAnnotationLink();
PDRectangle position = new PDRectangle();
position.setLowerLeftX(55);
position.setLowerLeftY(27);
position.setUpperRightX(188);
position.setUpperRightY(50);
txtLink.setRectangle(position);
// add an action
PDActionURI action = new PDActionURI();
action.setURI("www.loasoftwares.com");
txtLink.setBorderStyle(borderULine);
txtLink.setAction(action);
txtLink.setColour(white);
page.getAnnotations().add(txtLink);
}
doc.save(outputFile);
} finally {
if (doc != null) {
doc.close();
}
}
}
Remark:
It works only under specific circumstances (ASCII'ish font encodings and fairly long string arguments)
/**
* This is an example that will replace a string in a PDF with a new one.
*
* The example is taken from the pdf file format specification.
*
* @author Ben Litchfield
* @version $Revision: 1.3 $
*/
public class ReplaceString {
/**
* Constructor.
*/
public ReplaceString() {
super();
}
/**
* Locate a string in a PDF and replace it with a new string.
*
* @param inputFile The PDF to open.
* @param outputFile The PDF to write to.
* @param strToFind The string to find in the PDF document.
* @param message The message to write in the file.
*
* @throws IOException If there is an error writing the data.
* @throws COSVisitorException If there is an error writing the PDF.
*/
public void doIt( String inputFile, String outputFile, String strToFind, String message)
throws IOException, COSVisitorException {
// the document
PDDocument doc = null;
try
{
doc = PDDocument.load( inputFile );
List pages = doc.getDocumentCatalog().getAllPages();
for( int i=0; i<pages.size(); i++ )
{
PDPage page = (PDPage)pages.get( i );
PDStream contents = page.getContents();
PDFStreamParser parser = new PDFStreamParser(contents.getStream());
parser.parse();
List tokens = parser.getTokens();
for( int j=0; j<tokens.size(); j++ )
{
Object next = tokens.get( j );
if( next instanceof PDFOperator )
{
PDFOperator op = (PDFOperator)next;
//Tj and TJ are the two operators that display
//strings in a PDF
if( op.getOperation().equals( "Tj" ) )
{
//Tj takes one operand, and that is the string
//to display, so let's update that operand
COSString previous = (COSString)tokens.get( j-1 );
String string = previous.getString();
string = string.replaceFirst( strToFind, message );
previous.reset();
previous.append( string.getBytes("ISO-8859-1") );
}
else if( op.getOperation().equals( "TJ" ) )
{
COSArray previous = (COSArray)tokens.get( j-1 );
for( int k=0; k<previous.size(); k++ )
{
Object arrElement = previous.getObject( k );
if( arrElement instanceof COSString )
{
COSString cosString = (COSString)arrElement;
String string = cosString.getString();
string = string.replaceFirst( strToFind, message );
cosString.reset();
cosString.append( string.getBytes("ISO-8859-1") );
}
}
}
}
}
//now that the tokens are updated we will replace the
//page content stream.
PDStream updatedStream = new PDStream(doc);
OutputStream out = updatedStream.createOutputStream();
ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
tokenWriter.writeTokens( tokens );
page.setContents( updatedStream );
}
doc.save( outputFile );
}
finally
{
if( doc != null )
{
doc.close();
}
}
}
/**
* This will open a PDF and replace a string if it finds it.
* <br />
* see usage() for commandline
*
* @param args Command line arguments.
*/
public static void main(String[] args)
{
ReplaceString app = new ReplaceString();
try
{
if( args.length != 4 )
{
app.usage();
}
else
{
app.doIt( args[0], args[1], args[2], args[3] );
}
}
catch (Exception e)
{
e.printStackTrace();
}
}
/**
* This will print out a message telling how to use this example.
*/
private void usage()
{
System.err.println( "usage: " + this.getClass().getName() +
" <input-file> <output-file> <search-string> <Message>" );
}
}
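The example above writes the replaced text back with getBytes("ISO-8859-1"), which silently substitutes '?' for any character that charset cannot represent. The following is a minimal sketch of a pre-check using only the JDK. The class and method names are hypothetical. Passing the check says nothing about whether the PDF font actually uses a Latin-1-like encoding or contains the needed glyphs, so it covers only part of the "ASCII'ish" condition in the remark.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Latin1Check {
    // True if every character of 'text' has an ISO-8859-1 byte, so that
    // getBytes("ISO-8859-1") will not substitute '?' for any of them.
    static boolean fitsLatin1(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1("Hours worked: 2")); // plain ASCII text
        System.out.println(fitsLatin1("\u20AC 20"));       // euro sign is not in ISO-8859-1
    }
}
```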