I am getting the error below while trying to redact a PDF document using iText 7.
I am calling the pdfCleanupTool.cleanup() method for redaction, and sometimes it throws the following error:
Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
Any help appreciated.
Thanks!
Updates:
Error Log:
There is a bug in the iText 7 PdfTextArray class which generates stack traces like yours. As you don't share your PDF, though, I cannot be sure whether that's the bug bothering you currently.
The Bug
The bug can be provoked quite easily, in Java like this:
PdfTextArray textArray = new PdfTextArray();
textArray.add(1);
textArray.add(-1);
textArray.add(1);
(CancelingAdjustments test testCancelingAdjustments)
and similarly in C#.
This essentially may be what happens in the OP's case; redaction involves removal of text pieces from such text arrays and replacement by equivalent numeric adjustments, so such situations may be more probable during redaction than in general.
The Cause
When multiple numbers are added to a PdfTextArray in a row, it attempts to combine them into a single number and, if that number is zero, removes it altogether:
public boolean add(float number) {
// adding zero doesn't modify the TextArray at all
if (number != 0) {
if (!Float.isNaN(lastNumber)) {
lastNumber = number + lastNumber;
if (lastNumber != 0) {
set(size() - 1, new PdfNumber(lastNumber));
} else {
remove(size() - 1);
}
} else {
lastNumber = number;
super.add(new PdfNumber(lastNumber));
}
lastString = null;
return true;
}
return false;
}
(PdfTextArray method add)
But this code forgets to reset the lastNumber variable to "not a number" after removal due to cancelation. Thus, this bug can be fixed like this:
public boolean add(float number) {
// adding zero doesn't modify the TextArray at all
if (number != 0) {
if (!Float.isNaN(lastNumber)) {
lastNumber = number + lastNumber;
if (lastNumber != 0) {
set(size() - 1, new PdfNumber(lastNumber));
} else {
remove(size() - 1);
lastNumber = Float.NaN;
}
} else {
lastNumber = number;
super.add(new PdfNumber(lastNumber));
}
lastString = null;
return true;
}
return false;
}
(One could improve this further by testing whether there is a string at the now-last position of the array and initializing lastString accordingly.)
The iText/.Net code is very similar here.
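To watch the failure happen without any PDF at hand, the number handling can be modelled on a plain list. The sketch below is not iText code: the class TextArrayModel, its resetOnCancel flag and the helper failsOnCancelingSequence are hypothetical stand-ins that only mirror the logic of the add method quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PdfTextArray; only the number-merging logic is modelled.
class TextArrayModel {
    final List<Float> items = new ArrayList<>();
    float lastNumber = Float.NaN;
    final boolean resetOnCancel; // true = apply the proposed fix

    TextArrayModel(boolean resetOnCancel) {
        this.resetOnCancel = resetOnCancel;
    }

    boolean add(float number) {
        if (number == 0) {
            return false; // adding zero doesn't modify the array
        }
        if (!Float.isNaN(lastNumber)) {
            lastNumber = number + lastNumber;
            if (lastNumber != 0) {
                items.set(items.size() - 1, lastNumber);
            } else {
                items.remove(items.size() - 1);
                if (resetOnCancel) {
                    lastNumber = Float.NaN; // the reset missing from the original
                }
            }
        } else {
            lastNumber = number;
            items.add(lastNumber);
        }
        return true;
    }

    // Replays add(1), add(-1), add(1) and reports whether an index error occurred.
    static boolean failsOnCancelingSequence(boolean resetOnCancel) {
        TextArrayModel model = new TextArrayModel(resetOnCancel);
        try {
            model.add(1);
            model.add(-1);
            model.add(1);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("without reset fails: " + failsOnCancelingSequence(false));
        System.out.println("with reset fails: " + failsOnCancelingSequence(true));
    }
}
```

Without the reset, the third add still sees a stale lastNumber of 0, treats the now-empty list as if it ended in a number, and calls set with index -1.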
I want to slice a string by a range, which I can do in JavaScript, but I'm struggling to do it in Kotlin.
My current code is:
internal class blah {
fun longestPalindrome(s: String): String {
var longestP = ""
for (i in 0..s.length) {
for (j in 1..s.length) {
var subS = s.slice(i, j)
if (subS === subS.split("").reversed().joinToString("") && subS.length > longestP.length) {
longestP = subS
}
}
}
return longestP
}
and the error I get is:
Type mismatch.
Required:
IntRange
Found:
Int
Is there a way around this keeping most of the code I have?
As the error message says, slice wants an IntRange, not two Ints. So, pass it a range:
var subS = s.slice(i..j)
By the way, there are some bugs in your code:
You need to iterate up to the length minus 1 since the range starts at 0. But the easier way is to grab the indices range directly: for (i in s.indices)
I assume j should be i or bigger, not 1 or bigger, or you'll be checking some inverted Strings redundantly. It should look like for (j in i until s.length).
You need to use == instead of ===. The second operator tests referential equality, which will be false for two separately computed Strings even if their contents are identical.
I know this is probably just practice, but even with the above fixes, this code will fail if the String contains any multi-code-unit code points or any grapheme clusters. The proper way to do this would be by turning the String into a list of grapheme clusters and then performing the algorithm, but this is fairly complicated and should probably rely on some String processing code library.
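As a rough illustration of the grapheme-cluster point, here is a sketch in Java using the JDK's java.text.BreakIterator (also callable from Kotlin on the JVM). The class and method names are made up for this example, and how closely BreakIterator follows Unicode extended grapheme clusters depends on the JDK version, so treat it as an approximation and not a full solution.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: palindrome check on user-perceived characters
// instead of UTF-16 code units.
class GraphemePalindrome {
    // Splits a string at the character boundaries reported by BreakIterator.
    static List<String> clusters(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(s);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(s.substring(start, end));
        }
        return out;
    }

    // Compares clusters from both ends towards the middle.
    static boolean isPalindrome(String s) {
        List<String> c = clusters(s);
        for (int i = 0, j = c.size() - 1; i < j; i++, j--) {
            if (!c.get(i).equals(c.get(j))) {
                return false;
            }
        }
        return true;
    }
}
```

For a string such as "a", "e" plus a combining acute accent, "a", a code-unit reversal moves the accent onto the wrong letter, while the cluster-based check keeps the accented letter together.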
class Solution {
fun longestPalindrome(s: String): String {
var longestPal = ""
for (i in 0 until s.length) {
for (j in i + 1..s.length) {
val substring = s.substring(i, j)
if (substring == substring.reversed() && substring.length > longestPal.length) {
longestPal = substring
}
}
}
return longestPal
}
}
This code is now functioning but unfortunately is not optimized enough to get through all test cases.
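The version above tests every substring and reverses each one, which is roughly cubic in the length of the input. One common way to cut that down is to expand around each possible center, which is quadratic and allocates no intermediate strings. This is a sketch of that swapped-in technique, written in Java (the loops carry over to Kotlin almost line for line); the class name is arbitrary, and whether it is fast enough for the test cases in question is an assumption.

```java
// Expand-around-center search for the longest palindromic substring.
class PalindromeSketch {
    static String longestPalindrome(String s) {
        int bestStart = 0;
        int bestLen = 0;
        for (int center = 0; center < s.length(); center++) {
            // right == center covers odd lengths, right == center + 1 even lengths
            for (int right = center; right <= center + 1; right++) {
                int lo = center;
                int hi = right;
                while (lo >= 0 && hi < s.length() && s.charAt(lo) == s.charAt(hi)) {
                    lo--;
                    hi++;
                }
                // the loop overshoots by one on each side
                int len = hi - lo - 1;
                if (len > bestLen) {
                    bestLen = len;
                    bestStart = lo + 1;
                }
            }
        }
        return s.substring(bestStart, bestStart + bestLen);
    }
}
```

Like the original, it keeps the first palindrome found among those of equal length and compares UTF-16 code units, so the caveat about grapheme clusters still applies.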
I get a NullPointerException, and the trace tells me it is inside one of my functions. This function runs every frame and does some calculations. The problem is that when I debug and step through each line, the function runs fine, and the error only comes up at the end of the draw loop.
This error only recently came up, and the changes I made don't have much to do with the function in question so...
The function mentioned in the trace detects if the object touches something, and acts on it.
Also, the trace gives a line number that does not exist, I'm guessing that's because of the Processing compiling.
I am using Processing 4 if that matters.
Here's the trace:
java.lang.NullPointerException
at ants$Ant.sense(ants.java:190)
at ants$Ant.go(ants.java:220)
at ants.draw(ants.java:44)
at processing.core.PApplet.handleDraw(PApplet.java:2201)
at processing.awt.PSurfaceAWT$10.callDraw(PSurfaceAWT.java:1422)
at processing.core.PSurfaceNone$AnimationThread.run(PSurfaceNone.java:354)
Thanks!
Edit:
More info: This is an ant simulator, and it crashes at food pickup, which is managed by the sense() function. This used to work and broke while I was adding (seemingly) unrelated stuff. The Food class only has a position and a render function.
Here is some code:
void sense() { // The problematic function
if (!detectFood && !carryFood) {
float closest = viewRadius;
Food selected = null;
for (Food fd : foods){
float foodDist = position.dist(fd.position);
if(foodDist <= viewRadius) {
if(foodDist < closest) {
selected = fd;
closest = foodDist;
}
}
}
if (selected != null){
detectFood = true;
foodFocused = selected;
}
} else {
if(position.dist(foodFocused.position) < 2*r) {
takeFood();
detectFood = false;
}
}
}
void draw() { // draw loop
background(51);
for (Food food : foods) {
food.render();
}
for (Ant ant : ants) {
ant.go();
}
for (int i=0; i < trails.size(); i++) {
Trail trail = trails.get(i);
if (trail.strenght <= 0)
trails.remove(trail);
else
trail.go();
}
}
The problem is not the trail, as it still crashes without it.
I fear I'm trying to reinvent the wheel here. I'm putting objects into my sqflite database:
https://pub.dev/packages/sqflite
Some of the object fields are Lists of ints, others are Lists of Strings. I'm encoding these as plain Strings to place in a TEXT field in the database.
At some point I'm going to have to turn them back. I couldn't find anything on Google, which is surprising because this must be a very common need with sqflite.
I've started coding the 'decoding', but it's rookie Dart. Is there anything performant around that I ought to use?
Code included to prove I'm not totally lazy, no need to look, edge cases make it fail.
List<int> listOfInts = new List<int>();
String testStringOfInts = "[1,2,4]";
List<String> intermediateStep2 = testStringOfInts.split(',');
int numListElements = intermediateStep2.length;
print("intermediateStep2: $intermediateStep2, numListElements: $numListElements");
for (int i = 0; i < numListElements; i++) {
if (i == 0) {
listOfInts.add(int.parse(intermediateStep2[i].substring(1)));
continue;
}
else if ((i) == (numListElements - 1)) {
print('final element: ${intermediateStep2[i]}');
listOfInts.add(int.parse(intermediateStep2[i].substring(0, intermediateStep2[i].length - 1)));
continue;
}
else listOfInts.add(int.parse(intermediateStep2[i]));
}
print('Output: $listOfInts');
/* DECODING LISTS OF STRINGS */
String testString = "['element1','element2','element23']";
List<String> intermediateStep = testString.split("'");
List<String> output = new List<String>();
for (int i = 0; i < intermediateStep.length; i++) {
if (i % 2 == 0) {
continue;
} else {
print('adding a value to output: ${intermediateStep[i]}');
//print('value is a: ${(intermediateStep[i]).runtimeType}');
output.add(intermediateStep[i]);
}
}
print('Output: $output');
}
For the integers you could do the parsing like this:
void main() {
print(parseStringAsIntList("[1,2,4]")); // [1, 2, 4]
}
List<int> parseStringAsIntList(String stringOfInts) => stringOfInts
.substring(1, stringOfInts.length - 1)
.split(',')
.map(int.parse)
.toList();
I need more information about how the Strings are saved in corner cases, such as when they contain , and/or ', since this changes how the parsing should be done. If both characters are valid in a string (especially ,), I recommend changing the storage format to JSON instead, which makes encoding and decoding much easier and removes the risk of characters that cause problems.
But a rather naive solution can be made like this if we know each String does not contain ,:
void main() {
print(parseStringAsStringList("['element1','element2','element23']"));
// [element1, element2, element23]
}
List<String> parseStringAsStringList(String stringOfStrings) => stringOfStrings
.substring(1, stringOfStrings.length - 1)
.split(',')
.map((string) => string.substring(1, string.length - 1))
.toList();
I've read somewhere that a variable should be introduced into the code only if it is reused. But when I write my code for logic transparency, I sometimes create intermediate variables (with names reflecting what they contain) which are used only once.
How incorrect is this concept?
PS:
I want to do it right.
It is important to note that most of the time clarity takes precedence over re-usability or brevity. This is one of the basic principles of clean code. Most modern compilers optimize code anyway so creating new variables need not be a concern at all.
It is perfectly fine to create a new variable if it would add clarity to your code. Make sure to give it a meaningful name. Consider the following function:
public static boolean isLeapYear(final int yyyy) {
if ((yyyy % 4) != 0) {
return false;
}
else if ((yyyy % 400) == 0) {
return true;
}
else if ((yyyy % 100) == 0) {
return false;
}
else {
return true;
}
}
Even though the boolean expressions are used only once, they may confuse the reader of the code. We can rewrite it as follows:
public static boolean isLeapYear(int year) {
boolean fourth = year % 4 == 0;
boolean hundredth = year % 100 == 0;
boolean fourHundredth = year % 400 == 0;
return fourth && (!hundredth || fourHundredth);
}
These boolean variables add much more clarity to the code.
This example is from the Clean Code book by Robert C. Martin.
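A cheap way to convince yourself that a refactoring like this preserves behaviour is to run both versions side by side. The sketch below copies the two methods from above under different names; the class name LeapYearCheck, the method names and the tested range of years are arbitrary choices for illustration.

```java
// Compares the branching version and the named-boolean version year by year.
class LeapYearCheck {
    static boolean isLeapYearBranches(final int yyyy) {
        if ((yyyy % 4) != 0) {
            return false;
        } else if ((yyyy % 400) == 0) {
            return true;
        } else if ((yyyy % 100) == 0) {
            return false;
        } else {
            return true;
        }
    }

    static boolean isLeapYearNamed(int year) {
        boolean fourth = year % 4 == 0;
        boolean hundredth = year % 100 == 0;
        boolean fourHundredth = year % 400 == 0;
        return fourth && (!hundredth || fourHundredth);
    }

    // Returns the first year in [from, to] where the versions disagree, or -1 if none.
    static int firstMismatch(int from, int to) {
        for (int year = from; year <= to; year++) {
            if (isLeapYearBranches(year) != isLeapYearNamed(year)) {
                return year;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("first mismatch: " + firstMismatch(1, 3000));
    }
}
```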
I built the simplest possible index, with one document, using LuceneTestCase. My goal is to write numbers into the payload for each position of each term, to be used in a custom scoring formula implemented in a custom Query/Scorer.
I used SimpleTextCodec and checked that freqs, positions and payloads were really written to the index.
But when I read from the PostingsEnum, freq() returns 0, getPayload() returns null, and nextPosition() throws an exception:
java.lang.AssertionError: got line=field model
at __randomizedtesting.SeedInfo.seed([D334C9D1B5C155E3:2AAE4BE5481F4C8F]:0)
at org.apache.lucene.codecs.simpletext.SimpleTextFieldsReader$SimpleTextPostingsEnum.nextPosition(SimpleTextFieldsReader.java:455)
Here is how I'm reading the postings in the custom Query:
for (String field: fieldScores.keySet()) {
final Terms fieldTerms = reader.terms(field);
if (fieldTerms == null) {
continue;
}
if (!fieldTerms.hasPositions())
throw new IllegalStateException("Index does not contain positions");
if (!fieldTerms.hasPayloads())
throw new IllegalStateException("Index does not contain payloads");
final TermsEnum te = fieldTerms.iterator();
for (int j = 0; j < terms.length; j++) {
final Term t = terms[j];
if (t.field().equals(field) && te.seekExact(t.bytes())) {
PostingsEnum postingsEnum = te.postings(null, PostingsEnum.ALL);
int pos = postingsEnum.nextPosition();
BytesRef payload = postingsEnum.getPayload();
// assert payload.bytesEquals(new BytesRef(new byte[]{1}));
// TODO: use payload in scoring formula
fldScorers.add(new ConstTermScorer(this, t,
fieldScores.get(field) * termScores.get(t.text()),
postingsEnum));
}
}
}
I've found the reason. nextPosition(), freq() and getPayload() return 0 (or null) because the postingsEnum iterator has just been created and is not yet positioned on a document: postingsEnum.nextDoc() hasn't been called, so postingsEnum.docID() is -1. Calling nextDoc() before reading freq, positions and payloads fixes it. A silly situation, but it might be better if nextPosition(), freq() and getPayload() checked postingsEnum.docID() themselves.