Field value for PDRadioButton differs between PDFBox version 2.0.21 and higher versions - pdfbox

I am using PDFBox version 2.0.21 and call pdField.getValueAsString() to fetch the field value.
I want to upgrade to a higher version, but I get a different result for PDRadioButton.
Reference tickets for version 2.0.22:
PDFBOX-3683: changed how the value of a radio button with Opt entries is fetched.
PDFBOX-4617: introduced a new method to fetch the index.
So it seems there are now separate methods for the value and the index.
Ver 2.0.21: getValueAsString returns the index when the PDF file has Opt entries (index 0 when no option is selected).
Ver 2.0.21: getValueAsString returns the value when the PDF file has no Opt entries.
Ver 2.0.22: getValueAsString returns the value whether or not the PDF file has Opt entries.
Ver 2.0.22: getSelectedIndex returns the index whether or not the PDF file has Opt entries.
Ver 2.0.22: getSelectedIndex returns -1 when no option is selected.
I want to keep the behavior the same as in 2.0.21 to avoid breaking anything.
if (pdField instanceof PDRadioButton && pdField.getCOSObject().containsKey(COSName.OPT)) {
    fieldValue = String.valueOf(((PDRadioButton) pdField).getSelectedIndex());
} else {
    fieldValue = pdField.getValueAsString();
}
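I added a condition on the field type being a radio button and the field dictionary containing OPT, and in that case I fetch getSelectedIndex(). This gives the same result as 2.0.21, except when no option is selected: 2.0.21 returned the default index 0, while getSelectedIndex() returns -1.
Does PDFBOX-3683 only have an impact when "/Opt" is present? Is the condition above the correct approach? Is there any other method equivalent to getValueAsString in 2.0.21?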
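The remaining gap (0 versus -1 when nothing is selected) could be closed in the same shim. The following is only a sketch in plain Java with no PDFBox dependency: LegacyRadioValue, legacyValue and its parameters are hypothetical names, and the caller is assumed to pass in the results of containsKey(COSName.OPT), getSelectedIndex() and getValueAsString(). It relies solely on the version behaviour described in this question and has not been checked against PDFBox itself.

```java
/** Hypothetical helper, not part of PDFBox. */
final class LegacyRadioValue {
    private LegacyRadioValue() {}

    /**
     * Reproduces the 2.0.21 result of getValueAsString as described in the question.
     *
     * @param hasOpt        whether the field dictionary contains the Opt key
     * @param selectedIndex result of getSelectedIndex() (-1 when nothing is selected)
     * @param value         result of getValueAsString()
     */
    static String legacyValue(boolean hasOpt, int selectedIndex, String value) {
        if (!hasOpt) {
            // Without Opt entries both versions return the plain value.
            return value;
        }
        // Assumption from the question: 2.0.21 reported index 0 when no option
        // was selected, whereas newer versions report -1.
        return String.valueOf(Math.max(selectedIndex, 0));
    }

    public static void main(String[] args) {
        System.out.println(legacyValue(true, -1, "Off"));  // 0
        System.out.println(legacyValue(true, 2, "c"));     // 2
        System.out.println(legacyValue(false, -1, "Yes")); // Yes
    }
}
```

Note that mapping -1 to 0 makes "nothing selected" indistinguishable from "first option selected", which is exactly the 2.0.21 behaviour described above; whether that is desirable depends on the consumer of the value.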

Related

TagGroupGetTagType returns wrong type

Basically, this is a more specific version of my older question, which I still can't solve.
I was given an example image for my development. This image contains a (I guess representative) TagGroup that holds the information about how the image was created.
My problem is that TagGroupGetTagType() returns 3 for elements of this TagGroup that are TagGroups themselves. But TagGroups should have type 0 (confirmed by myself and #BmyGuest in the linked question). The following image shows the output of my example script together with the tag editor dialog. As one can see, every element has type 3, including TagGroups like Acquisition and others.
The image above has been created with the following script:
clearResults();
image img;
img.GetFrontImage();
TagGroup tg = img.ImageGetTagGroup();
TagGroupOpenBrowserWindow(tg, 0);
for(number i = 0; i < tg.TagGroupCountTags(); i++){
    String label = tg.TagGroupGetTagLabel(i);
    number type = tg.TagGroupGetTagType(i, 0);
    result("Index " + label + " has type " + type + "\n");
}
What am I doing wrong? Why does this not work? Is there any way to get the correct type?
This may be related to the file, so I created an example file from which some of the indices are removed (to protect the privacy of the people who gave me this file). The posted output was in fact created with this file, so the same problem occurs. The file can be downloaded from https://www.file-upload.net/download-14020685/example.dm4.html.
(For anyone who doesn't like to download files from random pages you can get the base64 encoded file contents here: https://cutpaste.online/notes.html?id=xcix7x9e9sHxMFwF3e5h)
A confirmation & Workaround
Using your script and provided file, I can exactly reproduce the result.
Moreover, if I run the following script (in GMS 3.4):
image img := RealImage("Test",4,10,10)
taggroup newTG = NewTagGroup()
newTG.TagGroupSetTagAsString("Test","oh")
img.imagegettaggroup().TagGroupSetTagAsTagGroup("TG",newTG)
img.ShowImage()
And then run your script, I get:
Index GMS Version has type 0
Index TG has type 0
However, if I save the file and then open it up again and run your script, I suddenly get:
Index GMS Version has type 3
Index TG has type 3
So, something has clearly changed and is off. I tried some older data (all saved with GMS 3.x) and I always get a type 3 for TagGroups. I could not find data saved by GMS 2.x or GMS 1.x but would assume either or both would return type 0.
I've also noticed that the command TagGroupGetTagTypeLength returns 0 before saving, but 1 for the loaded image, and I think that might be related.
But there is a workaround you can use, which might solve your actual question.
For TagGroups (and TagLists) you can replace your check for the type by an actual attempt to get the tag as a TagGroup, as in:
clearResults();
image img;
img.GetFrontImage();
TagGroup tg = img.ImageGetTagGroup();
TagGroupOpenBrowserWindow(tg, 0);
for(number i = 0; i < tg.TagGroupCountTags(); i++){
    String label = tg.TagGroupGetTagLabel(i);
    number typeL = tg.TagGroupGetTagTypeLength(i);
    number type = tg.TagGroupGetTagType(i, 0);
    result("Index " + label + " has " + typeL + " types. Type = " + type + "\n");
    TagGroup tgtest
    // Instead of trusting the type, try to fetch the tag as a TagGroup
    if ( tg.TagGroupGetIndexedTagAsTagGroup(i,tgtest) )
        Result("\tIt is a TagGroup (or TagList)!\n")
}

Pentaho JsonInput GET fields

I'm trying to use PDI to read data from an API (JSON). For now I'm simply trying to use the JSON input step to get a few specific fields, but the Get fields button on the input step gives me:
ERROR (version 8.3.0.0-371, build 8.3.0.0-371 from 2019-06-11 11.09.08 by buildguy) : Index 1 out of bounds for length 1
All the steps execute fine and produce data; only the JSON input step won't give me the fields option. I've tried the text file and JSON output steps and both write valid JSON, so I don't know what's going on.
PS. this is my first time using PDI
ISSUE 2:
It looks like PDI uses Jayway for its JSON path parsing, so I've been using the Jayway option on https://jsonpath.herokuapp.com/, which gives me my expected path. When I put that path into the 'fields' of the JSON input dialog, I only get the FIRST instance of that path value instead of every instance in the parsed JSON. I can't figure out why. I assume it has something to do with PDI's row-based view of things, but I don't know how to make it understand that this is JSON and that it should return all values that match the path.
UPDATE 1:
I've been looking at https://forums.pentaho.com/threads/135882-Parsing-JSON-data-without-knowing-field-names/ and it seems the Modified Java Script Value step might be the way to go. Will continue testing.
UPDATE 2
OK - I used the MJSV step as posted above along with a select fields step and was finally able to get the keys:
var obj = JSON.parse(mydata);
var keys = Object.keys(obj);
for (var i = 0; i < keys.length; i++) {
    var row = createRowCopy(getOutputRowMeta().size());
    var idx = getInputRowMeta().size();
    row[idx++] = keys[i];
    putRow(row);
}
trans_Status = SKIP_TRANSFORMATION;

Add field/string length to logstash event

I'm trying to add a string length field to an index. Ideally, I'd like to use the Kibana scripted field feature, as I can 'add' this field later, but I keep getting a null_pointer_exception with the following code. I'm trying to sort in a visualization based on the field's length.
doc['field'].value ? doc['field'].length() : 0
Is this correct?
I thought it was because my field isn't always set (sparse data), but I added the ? : 0 to guard against that (which didn't work).
Any ideas?
You can define a scripted field in Kibana, of type int, language painless, and try this:
return (doc['field'].value != null? doc['field'].value.length(): 0);

Lucene check if certain docIds is in a OpenBitSetDISI

Given an instance of OpenBitSetDISI, how can I check whether a single document id, or a list of document ids, is present in the set? Or is iterating through the OpenBitSetDISI the only option?
OpenBitSetDISI set = new OpenBitSetDISI(filter.GetDocIdSet(reader).Iterator(), reader.MaxDoc);
Using Lucene.NET 3.0.3
It's effectively a bit array with a bit set for each docId included.
So Get(docId) should return true if the id is in the set, false if not.
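The same per-id lookup extends to a list of ids without iterating the whole set. The sketch below uses java.util.BitSet purely as a stand-in for OpenBitSetDISI (an assumption for illustration; no Lucene.NET class is used here), and DocIdMembership with its methods is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper; BitSet stands in for the Lucene bit set. */
final class DocIdMembership {
    private DocIdMembership() {}

    /** True if the bit for docId is set; negative ids are never present. */
    static boolean contains(BitSet docs, int docId) {
        return docId >= 0 && docs.get(docId);
    }

    /** True only if every id in the list has its bit set. */
    static boolean containsAll(BitSet docs, List<Integer> docIds) {
        for (int id : docIds) {
            if (!contains(docs, id)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BitSet docs = new BitSet(16);
        docs.set(3);
        docs.set(7);
        System.out.println(contains(docs, 3));                      // true
        System.out.println(containsAll(docs, Arrays.asList(3, 7))); // true
        System.out.println(containsAll(docs, Arrays.asList(3, 8))); // false
    }
}
```

Each lookup is a single bit test, so checking a handful of ids this way costs far less than walking the entire set.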

Why does Term.Text return invalid data for NumericFields in Lucene.Net even though it is supposed to convert accordingly?

I was trying to return all values in order to use them later for facets, as follows:
TermEnum termsEnum = reader.Terms(new Term(groupByField, string.Empty));
But as soon as I added a field like this:
NumericField tempNumericField = new NumericField("price", Field.Store.YES, true);
Term.Text started to return wrong data for the price field.
Is there a way to return all data for both Field and NumericField instances?
NumericFields are stored in an encoded form (allows for correct ordering, ranges etc).
Try decoding the term text with NumericUtils.PrefixCodedToInt (or the appropriate method for long, etc.).