How to map Elasticsearch Spring Data AggregationsContainer contents to a custom model? - Kotlin

I am using Elasticsearch Spring Data. I have a custom repository that uses ElasticsearchOperations, based on the examples in the docs. I need some aggregation query results, and I successfully get the intended results, but I need to map those results to a model. Currently I'm unable to access the contents of the AggregationsContainer.
override fun getStats(startTime: Long, endTime: Long, pageable: Pageable): AggregationsContainer<*>? {
    val query: Query = NativeSearchQueryBuilder()
        .withQuery(QueryBuilders.rangeQuery("time").from(startTime).to(endTime))
        .withAggregations(AggregationBuilders.sum("discount").field("discount"))
        .withAggregations(AggregationBuilders.sum("price").field("price"))
        .withPageable(pageable)
        .build()
    val searchHits: SearchHits<Product> = operations.search(query, Product::class.java)
    return searchHits.aggregations
}
I return the result of the following call:
val stats = repository.getStats(before, currentTime, pageable)?.aggregations()
The result is:
{
    "asMap": {
        "discount": {
            "name": "discount",
            "metadata": null,
            "value": 8000.0,
            "valueAsString": "8000.0",
            "type": "sum",
            "fragment": true
        },
        "price": {
            "name": "price",
            "metadata": null,
            "value": 9000.0,
            "valueAsString": "9000.0",
            "type": "sum",
            "fragment": true
        }
    },
    "fragment": true
}
How can I convert the above output into an output model like the following? From what I tested, the contents of aggregations() are inaccessible and the type is Any:
{
    "priceSum": 9000.0,
    "discountSum": 8000
}

There is no data model for aggregations in the Elasticsearch RestHighLevelClient classes, and there is none in Spring Data Elasticsearch either. Therefore the original Aggregations object is returned to the caller (wrapped in that AggregationsContainer, because this will change with the new client implementation, and then the container will hold a different object).
You have to parse this yourself. I showed something similar in the answer to another question (https://stackoverflow.com/a/63105356/4393565). The interesting part for you is the last code block, where the aggregations are processed. You basically have to iterate over the elements, cast them to the appropriate type, and evaluate them.
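For your two sum aggregations that could look roughly like the sketch below. It assumes the RestHighLevelClient backend, where aggregations() returns an org.elasticsearch.search.aggregations.Aggregations instance (the exact package of ParsedSum depends on your Elasticsearch version), and StatsModel is a hypothetical target model:

import org.elasticsearch.search.aggregations.Aggregations
import org.elasticsearch.search.aggregations.metrics.ParsedSum
import org.springframework.data.elasticsearch.core.AggregationsContainer

// Hypothetical target model for the two sums.
data class StatsModel(val priceSum: Double, val discountSum: Double)

fun mapStats(container: AggregationsContainer<*>?): StatsModel? {
    // With the RestHighLevelClient backend the container wraps the
    // client's own Aggregations object.
    val aggregations = container?.aggregations() as? Aggregations ?: return null
    // Look each aggregation up by the name used in the query and cast it
    // to the parsed metric type to get at the computed value.
    val price: ParsedSum = aggregations.get("price") ?: return null
    val discount: ParsedSum = aggregations.get("discount") ?: return null
    return StatsModel(priceSum = price.value(), discountSum = discount.value())
}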

Related

GCP Dataflow JOB REST response add displayData object with { "key":"datasetName", ...}

Why doesn't this line of code generate a displayData object with { "key":"datasetName", ...}, and how can I generate it if it doesn't come by default when using a BigQuery source from Apache Beam?
bigqcollection = p | 'ReadFromBQ' >> beam.io.Read(beam.io.BigQuerySource(project=project,query=get_java_query))
[UPDATE] Adding the result I am trying to produce:
"displayData": [
{
"key": "table",
"namespace": "....",
"strValue": "..."
},
{
"key": "datasetName",
"strValue": "..."
}
]
Reading the implementation of display_data() for a BigQuerySource in the most recent version of Beam: it does not extract the table and dataset from the query, which is what your example uses. More significantly, it does not create any field named datasetName.
I would recommend writing a subclass of _BigQuerySource which adds the fields you need to the display data, while preserving all the other behavior.
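A rough sketch of that idea follows, assuming a recent Beam Python SDK where beam.io.BigQuerySource resolves to _BigQuerySource; the subclass name, the dataset_name parameter, and the datasetName key are illustrative, and the constructor arguments may differ between Beam versions:

import apache_beam as beam
from apache_beam.io.gcp.bigquery import _BigQuerySource
from apache_beam.transforms.display import DisplayDataItem

class AnnotatedBigQuerySource(_BigQuerySource):
    """A BigQuery source that adds a datasetName entry to its display data."""

    def __init__(self, dataset_name=None, **kwargs):
        super().__init__(**kwargs)
        self._dataset_name = dataset_name

    def display_data(self):
        # Keep everything the parent class already reports ...
        result = super().display_data()
        # ... and add the extra field the monitoring UI should show.
        result['datasetName'] = DisplayDataItem(self._dataset_name,
                                                label='Dataset Name')
        return result

bigqcollection = p | 'ReadFromBQ' >> beam.io.Read(
    AnnotatedBigQuerySource(dataset_name='my_dataset',
                            project=project, query=get_java_query))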

Returning unknown JSON in a query

Here is my scenario: I have data in a Cosmos DB and I want to return c.this, c.that, etc. as the indexer for Azure Cognitive Search. One field I want to return is JSON of an unknown structure. The one thing I do know about it is that it is flat. However, it is my understanding that the return value for an indexer needs to be known. How, using SQL in a SELECT, would I return all JSON elements in the flat object? Here is an example value I would be querying:
{
    "BusinessKey": "SomeKey",
    "Source": "flat",
    "id": "SomeId",
    "attributes": {
        "Source": "flat",
        "Element": "element",
        "SomeOtherElement": "someOtherElement"
    }
}
So I would want my SELECT to be something like:
SELECT
    c.BusinessKey,
    c.Source,
    c.id,
    -- SOMETHING HERE TO LIST OUT ALL ATTRIBUTES IN THE JSON AS FIELDS IN THE RESULT
And I would want the result to be:
{
    "BusinessKey": "SomeKey",
    "Source": "flat",
    "id": "SomeId",
    "attributes": [{"Source":"flat"},{"Element":"element"},{"SomeOtherElement":"someotherelement"}]
}
Currently we are calling ToString on c.attributes, which is the JSON of unknown structure, but that adds all the escape characters. When we want to search the index, we have to add all those escape characters as well, and it's getting really unruly.
Is there a way to do this using SQL?
Thanks for any help!
You could use a UDF in Cosmos DB SQL.
UDF code:
function userDefinedFunction(object){
var returnArray = [];
for (var key in object) {
var map = {};
map[key] = object[key];
returnArray.push(map);
}
return returnArray;
}
SQL:
SELECT
    c.BusinessKey,
    c.Source,
    c.id,
    udf.test(c.attributes) AS attributes
FROM c
Output:
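Applied to the sample document above, the query returns:

[
    {
        "BusinessKey": "SomeKey",
        "Source": "flat",
        "id": "SomeId",
        "attributes": [
            { "Source": "flat" },
            { "Element": "element" },
            { "SomeOtherElement": "someOtherElement" }
        ]
    }
]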

Use sprintf syntax inside logstash's sprintf syntax

For the below data structure:
{
    "sprints": [
        {
            "id": 17193,
            "name": "Sprint 12"
        },
        {
            "id": 16510,
            "name": "Sprint 11"
        }
    ],
    "velocityStatEntries": {
        "16510": {
            "estimated": {
                "value": 49
            },
            "completed": {
                "value": 36
            }
        },
        "17193": {
            "estimated": {
                "value": 52
            },
            "completed": {
                "value": 70
            }
        }
    }
}
Given this, I want to be able to produce an Elasticsearch object that's easier to handle, by adding the values of the Estimated and Completed fields to the sprints with their matching IDs.
Ideally, I would like to handle this without writing Ruby, but I am not finding a Logstash-native solution that handles this scenario.
First, I split the data on the sprints field using the split filter, so I only have a single sprints object and can use [sprints][id] to know what sprint I'm processing.
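That split is just:

filter {
    split { field => "sprints" }
}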
Then, I have attempted to work with the mutate filter, in one of two ways:
- using merge to add the [velocityStatEntries][] object to the current sprint
- using add_field to add the two fields I need
Syntactically, is this possible? Ideally, I would want to do a 'double substitution' of sorts, obtaining the estimated time for the current sprint with something like:
add_field => {
    "estimatedTime" => "%{[velocityStatEntries][%{[sprints][id]}][estimated][value]}"
}
but this only seems to work with a hardcoded format such as "estimatedTime" => "%{[velocityStatEntries][1234][estimated][value]}"
Do I have to use the Ruby format for this?
Logstash's sprintf references cannot be nested, so yes, a ruby filter is the way to go. For what it's worth, the Ruby solution is very simple:
ruby {
    code => "
        sprintId = event.get('[sprints][id]');
        estimated = event.get('[velocityStatEntries][' + sprintId.to_s + '][estimated][value]');
        completed = event.get('[velocityStatEntries][' + sprintId.to_s + '][completed][value]');
        event.set('[sprints][estimatedUnits]', estimated);
        event.set('[sprints][completedUnits]', completed);
    "
}
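For the sample data above, each resulting event's sprints field then looks like this (here for sprint 17193):

{
    "id": 17193,
    "name": "Sprint 12",
    "estimatedUnits": 52,
    "completedUnits": 70
}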

Figure out different values to send partial update to server

From a form submission I receive two objects: the original values and the dirty values. I would like to figure out how to create a diff to send to the server, using the following rules:
- the id field of the root object should always be included
- all changed primitive values should be included
- all nested changes should be included as well
- if a nested value other than id changed, it should include id as well
Original values:
{
    "id": 10,
    "name": "tkvw",
    "locale": "nl",
    "address": {
        "id": 2,
        "street": "Somewhere",
        "zipcode": "8965"
    },
    "subscriptions": [8, 9, 10],
    "category": {
        "id": 6
    }
}
Example expected diff objects:
1) User changes the name field to "foo":
{
    "id": 10,
    "name": "foo"
}
2) User changes the street field on the address node, and the category:
{
    "id": 10,
    "address": {
        "id": 2,
        "street": "Changed"
    },
    "category": {
        "id": 5
    }
}
I do understand the basics of functional programming, but I just need a hint in the right direction (some pseudocode maybe).
Take a look at JSON Patch (RFC 6902). JSON Patch is a format for describing changes to a JSON document. For example:
[
    { "op": "replace", "path": "/baz", "value": "boo" },
    { "op": "add", "path": "/hello", "value": ["world"] },
    { "op": "remove", "path": "/foo" }
]
You generate a patch by comparing two JS objects/arrays, and then you can apply the patch to the original object (on the server side, for example) to reflect the changes.
You can create a patch using the fast-json-patch lib.
const obj1 = {"id":10,"name":"tkvw","locale":"nl","address":{"id":2,"street":"Somewhere","zipcode":"8965"},"subscriptions":[8,9,10],"category":{"id":6}};
const obj2 = {"id":10,"name":"cats","locale":"nl","address":{"id":2,"street":"Somewhere","zipcode":"8965"},"subscriptions":[8,9,10,11],"category":{"id":7}};
const delta = jsonpatch.compare(obj1, obj2);
console.log('delta:\n', delta);
const doc = jsonpatch.applyPatch(obj1, delta).newDocument;
console.log('patched obj1:\n', doc);
<script src="https://cdnjs.cloudflare.com/ajax/libs/fast-json-patch/2.0.6/fast-json-patch.min.js"></script>
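For the two objects above, the generated delta looks roughly like this (the exact order of the operations may vary):

[
    { "op": "replace", "path": "/category/id", "value": 7 },
    { "op": "add", "path": "/subscriptions/3", "value": 11 },
    { "op": "replace", "path": "/name", "value": "cats" }
]

Note that JSON Patch does not automatically include the unchanged id fields your rules call for; if you need those, you would have to post-process the delta, pulling in the id of each object a path touches.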

Filter an object array to modify json with circe

I am evaluating Circe and couldn't find out how to use a filter on arrays to transform a JSON document. I read the guide on its website and the API docs, and still have no clue. Help much appreciated.
Sample data:
{
    "Department": "HR",
    "Employees": [{ "name": "abc", "age": 25 }, { "name": "def", "age": 30 }]
}
Task:
How do I use a filter on Employees to transform the JSON into another JSON, for example one that keeps only the employees older than 50?
In case you ask: for some reason I can't filter at the data source before the JSON is generated.
Thanks
One possible way of doing this:
import io.circe._
import io.circe.parser._

val data = """{"Department" : "HR","Employees" :[{ "name": "abc", "age": 25 }, {"name":"def", "age":30}]}"""

def ageFilter(j: Json): Json = j.withArray { x =>
  Json.fromValues(x.filter(_.hcursor.downField("age").as[Int].map(_ > 26).getOrElse(false)))
}

val y: Either[ParsingFailure, Json] =
  parse(data).map(_.hcursor.downField("Employees").withFocus(ageFilter).top.get)

println(s"$y")
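For the sample data, only the second employee passes the age > 26 filter, so y prints (modulo circe's output formatting) as:

Right({"Department":"HR","Employees":[{"name":"def","age":30}]})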