XMLUnit NodeFilter not finding difference - xmlunit-2

I am comparing two simple XML documents and want to compare only nodes with a specific local name, in this case only flowerA. When I filter with "not equals flowerB", I get the difference for the flowerA node. When I filter with "equals flowerA", I get no differences at all. Why?
public class XMLDiff {
public static void main(String[] args) {
String controlXml = "<flowers><flowerA>Rose</flowerA><flowerB>Daisy</flowerB></flowers>";
String testXml = "<flowers><flowerA>Roses</flowerA><flowerB>Daisies</flowerB></flowers>";
Diff build = DiffBuilder.compare(controlXml).withTest(testXml)
.ignoreWhitespace()
.withNodeFilter(node -> !node.getNodeName().equals("flowerB"))
.build();
System.out.println(build.getDifferences());
}
}
[Expected text value 'Rose' but was 'Roses' - comparing <flowerA ...>Rose</flowerA> at /flowers[1]/flowerA[1]/text()[1] to <flowerA ...>Roses</flowerA> at /flowers[1]/flowerA[1]/text()[1] (DIFFERENT)]
public class XMLDiff {
public static void main(String[] args) {
String controlXml = "<flowers><flowerA>Rose</flowerA><flowerB>Daisy</flowerB></flowers>";
String testXml = "<flowers><flowerA>Roses</flowerA><flowerB>Daisies</flowerB></flowers>";
Diff build = DiffBuilder.compare(controlXml).withTest(testXml)
.ignoreWhitespace()
.withNodeFilter(node -> node.getNodeName().equals("flowerA"))
.build();
System.out.println(build.getDifferences());
}
}
[]

Your root element flowers is not matched by the NodeFilter, so you end up comparing nothing at all.
NodeFilter is best suited for a deny list of nodes you do not want to compare. In your case you need to ensure you also allow all nodes that are encountered while traversing to the node you are interested in.
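One way to follow this advice is a filter that accepts every element on the path to flowerA, plus non-element nodes such as text. The sketch below uses only the JDK's DOM API. The class name FlowerAFilter, its methods, and the small tree walker are illustrative assumptions: the walker merely mimics "children of a rejected node are never visited" and is not XMLUnit's implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FlowerAFilter {

    // Allow-list filter: keeps the root element, the target element,
    // and all non-element nodes (for example the text inside flowerA).
    public static boolean accept(Node node) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return true;
        }
        String name = node.getNodeName();
        return name.equals("flowers") || name.equals("flowerA");
    }

    // The filter from the question: rejects the root element as well.
    public static boolean onlyFlowerA(Node node) {
        return node.getNodeName().equals("flowerA");
    }

    // Hypothetical helper: collects the text nodes that remain reachable
    // when children of rejected nodes are skipped.
    public static List<String> visibleText(String xml, Predicate<Node> filter) {
        try {
            Node document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            collect(document, filter, out);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    private static void collect(Node parent, Predicate<Node> filter, List<String> out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (!filter.test(child)) {
                continue; // rejected node: its subtree is never visited
            }
            if (child.getNodeType() == Node.TEXT_NODE) {
                out.add(child.getNodeValue());
            }
            collect(child, filter, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<flowers><flowerA>Rose</flowerA><flowerB>Daisy</flowerB></flowers>";
        System.out.println(visibleText(xml, FlowerAFilter::accept));      // [Rose]
        System.out.println(visibleText(xml, FlowerAFilter::onlyFlowerA)); // []
    }
}
```

Assuming XMLUnit 2 is on the classpath, the same predicate could be passed as .withNodeFilter(FlowerAFilter::accept) in the DiffBuilder chain from the question.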

ArrayIndexOutOfBounds exception for an empty string on @RestQuery

I have a handler in my RESTEasy Lambda application
with the url: http://localhost:8080/store/search/v1/suggest
public List<Map<String, Object>> storeSearch(
@RestHeader("id") String id,
@RestQuery final String q,
@RestQuery final String columns) {
if (StringUtils.isBlank(q)) {
logger.error("Search query is empty");
return Collections.EMPTY_LIST;
}
The app works for the cases below but fails if q is empty:
http://localhost:8080/store/search/v1/suggest?q=ab
http://localhost:8080/store/search/v1/suggest
It fails here:
http://localhost:8080/store/search/v1/suggest?q=
Can you please suggest what I am missing here?

Data is written to BigQuery but not in proper format

I'm writing data to BigQuery, and it is written successfully, but I'm concerned about the format in which it is written.
Below is the format in which the data is shown when I execute any query in BigQuery:
Check the first row: the value of SalesComponent is CPS_H, but it shows 'BeamRecord [dataValues=[CPS_H', and the ModelIteration value ends with a square bracket.
Below is the code that is used to push data to BigQuery from BeamSql:
TableSchema tableSchema = new TableSchema().setFields(ImmutableList.of(
new TableFieldSchema().setName("SalesComponent").setType("STRING").setMode("REQUIRED"),
new TableFieldSchema().setName("DuetoValue").setType("STRING").setMode("REQUIRED"),
new TableFieldSchema().setName("ModelIteration").setType("STRING").setMode("REQUIRED")
));
TableReference tableSpec = BigQueryHelpers.parseTableSpec("beta-194409:data_id1.tables_test");
System.out.println("Start Bigquery");
final_out.apply(MapElements.into(TypeDescriptor.of(TableRow.class)).via(
(MyOutputClass elem) -> new TableRow().set("SalesComponent", elem.SalesComponent).set("DuetoValue", elem.DuetoValue).set("ModelIteration", elem.ModelIteration)))
.apply(BigQueryIO.writeTableRows()
.to(tableSpec)
.withSchema(tableSchema)
.withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
.withWriteDisposition(WriteDisposition.WRITE_TRUNCATE));
p.run().waitUntilFinish();
EDIT
I have transformed BeamRecord into MyOutputClass using the code below, and this also doesn't work:
PCollection<MyOutputClass> final_out = join_query.apply(ParDo.of(new DoFn<BeamRecord, MyOutputClass>() {
private static final long serialVersionUID = 1L;
@ProcessElement
public void processElement(ProcessContext c) {
BeamRecord record = c.element();
String[] strArr = record.toString().split(",");
MyOutputClass moc = new MyOutputClass();
moc.setSalesComponent(strArr[0]);
moc.setDuetoValue(strArr[1]);
moc.setModelIteration(strArr[2]);
c.output(moc);
}
}));
It looks like your MyOutputClass is constructed with incorrect values. BigQueryIO creates rows with the correct fields just fine, but those fields hold wrong values, which means that when you call .set("SalesComponent", elem.SalesComponent) the data in elem is already incorrect.
My guess is that the problem is in a previous step, where you convert from BeamRecord to MyOutputClass. You would get a result similar to what you're seeing if you did something like this (or some other conversion logic did this for you behind the scenes):
1. convert the BeamRecord to a string by calling beamRecord.toString() (if you look at the BeamRecord.toString() implementation, you can see that it produces exactly that string format);
2. split this string by "," to get an array of strings;
3. construct MyOutputClass from that array.
Pseudocode for this is something like:
PCollection<MyOutputClass> final_out =
    beamRecords
        .apply(
            ParDo.of(new DoFn<BeamRecord, MyOutputClass>() {
                @ProcessElement
                public void processElement(ProcessContext c) {
                    BeamRecord record = c.element();
                    // Anti-pattern: parses the debug string instead of reading the fields
                    String[] fields = record.toString().split(",");
                    MyOutputClass elem = new MyOutputClass();
                    elem.SalesComponent = fields[0];
                    elem.DuetoValue = fields[1];
                    ...
                    c.output(elem);
                }
            })
        );
The correct way of doing something like this is to call getters on the record instead of splitting its string representation, along these lines (pseudocode):
PCollection<MyOutputClass> final_out =
    beamRecords
        .apply(
            ParDo.of(new DoFn<BeamRecord, MyOutputClass>() {
                @ProcessElement
                public void processElement(ProcessContext c) {
                    BeamRecord record = c.element();
                    MyOutputClass elem = new MyOutputClass();
                    // get field value by name
                    elem.SalesComponent = record.getString("CPS_H...");
                    // get another field value by name
                    elem.DuetoValue = record.getInteger("...");
                    ...
                    c.output(elem);
                }
            })
        );
You can verify something like this by adding a simple ParDo where you either put a breakpoint and look at the elements in the debugger, or output the elements somewhere else (e.g. console).
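To see the effect outside of a pipeline, here is a minimal plain-Java sketch. FakeRecord is a hypothetical stand-in, not the Beam class: its toString() is only modeled on the 'BeamRecord [dataValues=[CPS_H' prefix reported in the question, and the sample values "0.5" and "1" as well as the "dataType=demo" suffix are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class ToStringSplitDemo {

    // Hypothetical stand-in for a record whose toString() wraps its values
    // in a debug-style envelope (format assumed from the question's output).
    public static final class FakeRecord {
        private final List<String> dataValues;

        public FakeRecord(String... values) {
            this.dataValues = Arrays.asList(values);
        }

        public String getString(int index) {
            return dataValues.get(index);
        }

        @Override
        public String toString() {
            return "BeamRecord [dataValues=" + dataValues + ", dataType=demo]";
        }
    }

    // Problematic conversion: splits the debug string on commas.
    public static String[] viaToString(FakeRecord record) {
        return record.toString().split(",");
    }

    // Safer conversion: reads each field through a getter.
    public static String[] viaGetters(FakeRecord record) {
        return new String[] {
            record.getString(0), record.getString(1), record.getString(2)
        };
    }

    public static void main(String[] args) {
        FakeRecord record = new FakeRecord("CPS_H", "0.5", "1");
        System.out.println(Arrays.toString(viaToString(record)));
        System.out.println(Arrays.toString(viaGetters(record)));
    }
}
```

With the split approach the first element carries the "BeamRecord [dataValues=[" prefix and the third ends with "]", which matches the symptoms described in the question. The getter approach returns the bare values.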
I was able to resolve this issue using the approach below:
PCollection<MyOutputClass> final_out = record40.apply(ParDo.of(new DoFn<BeamRecord, MyOutputClass>() {
private static final long serialVersionUID = 1L;
@ProcessElement
public void processElement(ProcessContext c) throws ParseException {
BeamRecord record = c.element();
String strArr = record.toString();
String strArr1 = strArr.substring(24);
String xyz = strArr1.replace("]","");
String[] strArr2 = xyz.split(",");

Hibernate search boolean filter

I have a Book entity:
@Entity
@Indexed
public class Book extends BaseEntity {
    @Field
    private String subtitle;
    @DateBridge(resolution = Resolution.DAY)
    private Date publicationDate;
    @Field
    private int score;
    @IndexedEmbedded
    @ManyToMany(fetch = FetchType.EAGER)
    @Cascade(value = {CascadeType.ALL})
    private List<Author> authors = new ArrayList<Author>();
    @Field
    @FieldBridge(impl = BooleanBridge.class)
    private boolean prohibited;
And a filter on the boolean field "prohibited":
public class BFilter extends Filter {
@Override
public DocIdSet getDocIdSet(IndexReader indexReader) throws IOException {
OpenBitSet bitSet = new OpenBitSet(indexReader.maxDoc());
TermDocs termDocs = indexReader.termDocs(new Term("prohibited","false"));
while (termDocs.next()) {
bitSet.set(termDocs.doc());
}
return bitSet;
}
}
Search method
public List<T> findByQuery(Class c, String q) throws InterruptedException {
FullTextSession fullTextSession = Search.getFullTextSession(session);
fullTextSession.createIndexer().startAndWait();
QueryBuilder qb = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(c).get();
Query luceneQuery = qb
.keyword()
.fuzzy()
.onFields("title", "subtitle", "authors.name", "prohibited", "score")
.matching(q)
.createQuery();
FullTextQuery createFullTextQuery = fullTextSession.createFullTextQuery(luceneQuery, Book.class, BaseEntity.class);
createFullTextQuery.setFilter(new BFilter());
return createFullTextQuery.list();
}
If I apply that filter, the search result is empty, although the entries are definitely in the database. What am I doing wrong? If I change the filter field to "score", everything works and the result is not empty. It only fails to search on the boolean field.
The basic approach looks OK. A couple of comments: you are calling the indexer on every findByQuery call. Not sure whether this is just test code, but you should index before you search, and only once or when things change (you can also use automatic index updates). Depending on your transaction setup, it might also be that your search cannot see the indexed data. However, you seem to say that everything works if you don't use a filter at all. In that case I would add some debug output to the filter, or step through it in a debugger, to see what's going on and whether it gets called at all. Last but not least, you don't need to explicitly set @FieldBridge(impl = BooleanBridge.class).

how to assign a sequentially increasing number to a relation

I am wondering if there is a way to create a new field in a relation and then assign some sequentially increasing number to it? Here is one example:
ordered_products = ORDER products BY price ASC;
ordered_products_with_sequential_id = FOREACH ordered_products GENERATE price, some_sequential_id;
How can I create some_sequential_id? I am not sure whether that's doable in Pig though.
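I suspect you have to write your own UDF to get that running. One way to do it would be to increment a static variable (an AtomicLong) in the exec implementation of your UDF. Note that a static counter is only unique within a single JVM, so the IDs are sequential only as long as the FOREACH runs in one task.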
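import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class IncrEval extends EvalFunc<Long> {
    // Shared counter; unique only within a single JVM.
    private static final AtomicLong res = new AtomicLong(0);

    @Override
    public Long exec(Tuple tip) throws IOException {
        if (tip == null || tip.size() == 0) {
            return null;
        }
        // Increment and read in one atomic step so two callers never see the same value.
        return res.incrementAndGet();
    }
}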
Pig script entry:
b = FOREACH a GENERATE <something>, com.eval.IncrEval() AS ID:long;
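The counter idea can be tried outside Pig with plain Java. The sketch below is only an illustration under assumptions: the class SequentialIds and its methods are hypothetical names, and it shows that using the value returned by incrementAndGet() directly yields distinct IDs even when several threads share the counter inside one JVM.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class SequentialIds {

    private static final AtomicLong COUNTER = new AtomicLong(0);

    // Increment and read in a single atomic operation.
    public static long nextId() {
        return COUNTER.incrementAndGet();
    }

    // Starts the given number of threads, lets each draw idsPerThread IDs,
    // and returns how many distinct IDs were handed out in total.
    public static int distinctIds(int threads, int idsPerThread) {
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    seen.add(nextId());
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Every ID drawn is distinct, so the count equals threads * idsPerThread.
        System.out.println(distinctIds(4, 1000));
    }
}
```

This guarantee holds only within one JVM. Separate tasks on a cluster each get their own counter.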

Java 8 Streams - Reading, logic and printing to file in a single stream

I'd like to read a file containing strings and numbers, extract the numbers from each line, sum them, and print the result to a new file.
I know it may be a bit challenging, but is it even possible to do this in a single stream?
e.g. input:
some numbers 1 2 3
number 4 5 6
and few more number 7 8 9
e.g. output:
1+2+3 = 6
4+5+6 = 15
7+8+9 = 24
My main class for testing
public class MainFiles {
public static void main(String[] args) {
String filePath = "src/test/recources/1000.txt";
new FileProcesorStream().fileReader(filePath);
}
}
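and so far I have done something like this:
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;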
public class FileProcesorStream {
public void fileReader(String fileName) {
try (Stream<String> streamReader = Files.lines(Paths.get(fileName));
PrintWriter printWriter = new PrintWriter("resultStream.txt")) {
streamReader
.filter(line -> line.matches("[\\d\\s]+"))
.forEachOrdered(printWriter::println);
} catch (IOException e) {
System.out.println("File not found");
e.printStackTrace();
}
}
}
Is this at least a good start for solving the problem? If not, where should I start, or what should I change?
That is, of course, assuming it is possible with a single stream at all.
You can use String::replaceAll with the regex [^\\d\\s]+ in a map operation to strip everything except digits and whitespace, then apply String::trim to remove the leading and trailing whitespace.
After that, compute the sum of the numbers on each line.
streamReader
.map(line -> line.replaceAll("[^\\d\\s]+", ""))
.map(String::trim)
.mapToInt(line -> Arrays.stream(line.split(" "))
.map(String::trim)
.filter(e -> !e.isEmpty())
.mapToInt(Integer::parseInt)
.sum()
)
.forEachOrdered(printWriter::println);
In Java 9, if you ever upgrade, there is a simpler way to get those sums (via Scanner#findAll):
streamReader.map(Scanner::new)
.mapToInt(s -> s.findAll("\\d+")
.map(MatchResult::group)
.mapToInt(Integer::parseInt)
.sum())
.forEachOrdered(System.out::println);
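Both snippets above print only the sums. To get the "1+2+3 = 6" form from the question, each line can be mapped to a formatted string instead. The following is a minimal sketch for Java 8. The class name LineSums and the method format are illustrative, and it assumes non-negative integers that fit into an int.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LineSums {

    // Turns a line such as "some numbers 1 2 3" into "1+2+3 = 6".
    // A line without any digits yields " = 0".
    public static String format(String line) {
        List<Integer> numbers = Arrays.stream(line.split("\\D+"))
                .filter(token -> !token.isEmpty()) // split may yield a leading empty token
                .map(Integer::valueOf)
                .collect(Collectors.toList());
        String expression = numbers.stream()
                .map(String::valueOf)
                .collect(Collectors.joining("+"));
        int sum = numbers.stream().mapToInt(Integer::intValue).sum();
        return expression + " = " + sum;
    }

    public static void main(String[] args) {
        Stream.of("some numbers 1 2 3", "number 4 5 6", "and few more number 7 8 9")
                .map(LineSums::format)
                .forEachOrdered(System.out::println);
        // 1+2+3 = 6
        // 4+5+6 = 15
        // 7+8+9 = 24
    }
}
```

In the code from the question this would slot in as streamReader.map(LineSums::format).forEachOrdered(printWriter::println), which keeps everything in a single stream pipeline.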