Get all values from property with LINQ from JSON? - vb.net

I'm having trouble sorting out the exact syntax to properly query my JSON response.
My API endpoint returns some JSON as follows:
{
"status": "Succeeded",
"recognitionResults": [
{
"page": 1,
"clockwiseOrientation": 0.14,
"width": 2835,
"height": 2241,
"unit": "pixel",
"lines": [
{
"boundingBox": [
25,
11,
324,
15,
323,
51,
24,
46
],
"text": "Custom Report",
"words": [
{
"boundingBox": [
37,
11,
171,
14,
172,
49,
38,
48
],
"text": "Custom"
},
{
"boundingBox": [
193,
15,
322,
17,
323,
49,
194,
49
],
"text": "Report"
}
]
}...
Note that the root element contains exactly one recognitionResults array. Each entry in that array contains a lines array. Each line has a text property and also a words array whose entries have their own text property. I'm only concerned with the text property that is a direct child of each line.
I'm attempting to select all of the text properties into a list of strings.
vb.net code:
File.WriteAllText(Path.GetFileName(strFilePath) & ".json", JToken.Parse(strResult).ToString())
Dim c1 As JArray = CType(tmpObj("recognitionResults"), JArray)
Dim c2 = (From s In c1.Children() Select s("text")).ToList()
This throws an exception that the JArray has an invalid key; an int is expected.
I also thought I could just query it with LINQ directly:
Dim c3 = (From s In tmpObj Select s("text")).ToList()
This throws an exception saying that it cannot access a child value on Newtonsoft.Json.Linq.JProperty.
Lastly, I've also tried this:
Dim c2 = (From p In tmpObj("recognitionResults")("lines").Children() Select p("text"))
I'm really stuck at this point. I think I just have a syntax problem in how I am trying to select. Can someone point me in the right direction?

You can use SelectTokens with a wildcard to get the information you want from the JSON more easily:
Dim token As JToken = JToken.Parse(json)
Dim lines As List(Of String) = _
(From t In token.SelectTokens("..lines[*].text") Select CStr(t)).ToList()
Working demo: https://dotnetfiddle.net/Zlenga
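To see which values a path like ..lines[*].text picks up, here is a minimal Python sketch of the same traversal. It is not Json.NET. The helper name line_texts and the trimmed sample are made up for illustration, and the sketch assumes that lines only appears directly under each recognitionResults entry.

```python
import json

# Trimmed-down sample shaped like the response in the question.
sample = json.loads("""
{
  "status": "Succeeded",
  "recognitionResults": [
    {
      "page": 1,
      "lines": [
        {
          "text": "Custom Report",
          "words": [{"text": "Custom"}, {"text": "Report"}]
        }
      ]
    }
  ]
}
""")


def line_texts(doc):
    """Collect the text of every line, skipping the nested words[].text values."""
    return [
        line["text"]
        for result in doc.get("recognitionResults", [])
        for line in result.get("lines", [])
    ]


print(line_texts(sample))
```

Only the line-level text is collected; the text values under words are never visited because the loop stops at each line.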

Related

AWS mqtt SQL query

I have the following mqtt message:
{
"sensors": [
{
"lsid": 412618,
"data": [
{
"temp_in": 72.3,
"heat_index_in": 72,
"dew_point_in": 55.9,
"ts": 1652785241,
"hum_in": 56.3
}
],
"sensor_type": 243,
"data_structure_type": 12
},
{
"lsid": 421195,
}
I can get the "sensors,0.lsid" value and the entire "data" array using this query:
select get(sensors,0).lsid as ls, get(sensors, 0).data as data1 from "topic"
but what I really need is "temp_in": 72.3, i.e. a value from the second-level array.
I've tried following the AWS documentation (AWS Doc.), but unless I'm misreading it, it doesn't seem to work.
Any help would be greatly appreciated.

Add two json values dynamically in azure data factory

I need to combine two JSON values in Data Factory: one comes dynamically from the output of an activity, and the other is a pipeline variable.
This is what I am doing:
@union(activity('Get Order Events Data').output, json('{"orig_orderID" : "variables('orderid')"}'))
But it shows this error:
Missing comma between arguments
What am I doing wrong here?
Regarding "But it is showing error. Missing comma between arguments":
This happens because the expression variables('orderid') contains single quotes ('), which end the surrounding string literal early and split your expression.
You should use the concat() function to build the string instead: @union(activity('Get Order Events Data').output, json(concat('{"orig_orderID" :',variables('orderid'),'}'))).
However, this expression does not give your expected result, because the new property is added at the top level rather than inside your data. The result would look like this:
{
"data": [
{
"id": 145,
"order_id": 256,
"created_at": "2021-06-20T11:48:20Z",
"type": 10,
"sender": -1,
"message": null,
"previous_status": 4,
"fas_user_id": null,
"event_data": "5",
"shopkeeper_timestamp": null,
"store_id": 123
}
],
"orig_orderID": "860"
}
You can try the following expression instead: @union(activity('Get Order Events Data').output.data[0], json(concat('{"orig_orderID" :',variables('orderid'),'}')))
It produces this result:
{
"id": 145,
"order_id": 256,
"created_at": "2021-06-20T11:48:20Z",
"type": 10,
"sender": -1,
"message": null,
"previous_status": 4,
"fas_user_id": null,
"event_data": "5",
"shopkeeper_timestamp": null,
"store_id": 123,
"orig_orderID": "860"
}
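The difference between the two union() calls can be sketched with plain Python dictionaries. This is only an analogy for how merging two objects behaves, not Data Factory code. The names activity_output and order_id are hypothetical stand-ins for the activity output and the pipeline variable, and the record is shortened.

```python
# Hypothetical stand-ins for the activity output and the pipeline variable.
activity_output = {"data": [{"id": 145, "order_id": 256, "store_id": 123}]}
order_id = "860"

extra = {"orig_orderID": order_id}

# Merging with the whole output puts the new key next to "data".
merged_top_level = {**activity_output, **extra}

# Merging with data[0] puts the new key inside the record itself.
merged_record = {**activity_output["data"][0], **extra}

print(merged_top_level)
print(merged_record)
```

The first merge leaves the record inside data untouched, which matches the first result shown above; the second merge yields a single flat record that includes orig_orderID.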

Extract element from output array in a Copy Data activity

I have a copy data activity that dynamically adds a datetime suffix to the sink file name, which is based on utcnow(). This corresponds to the start datetime in the copy data activity. I am looking to extract the 'start' element from the executionDetails array in the output:
{
"dataRead": 0,
"dataWritten": 86,
"filesWritten": 1,
"sourcePeakConnections": 1,
"sinkPeakConnections": 1,
"rowsRead": 0,
"rowsCopied": 0,
"copyDuration": 4,
"throughput": 0,
"errors": [],
"effectiveIntegrationRuntime": "FXL",
"usedParallelCopies": 1,
"executionDetails": [
{
"source": {
"type": "SqlServer"
},
"sink": {
"type": "AzureBlobFS"
},
"status": "Succeeded",
"start": "2019-08-06T12:29:20.477586Z",
"duration": 4,
"usedParallelCopies": 1,
"detailedDurations": {
"queuingDuration": 3,
"transferDuration": 1
}
}
]
}
Assuming the activity is called CopyData, I want to set the value of start to a variable. I am struggling to get this: a simple @activity('CopyData').output.executionDetails.start does not work and tells me to use an integer index into the executionDetails array. However, @activity('CopyData').output.executionDetails[3] fails with an error saying the valid range is (0,0). I am looking for a way to extract the timestamp into a string variable.
I can store executionDetails in an array variable, but I am still unable to extract the start value from it afterwards.
I have already worked it out: the range is (0,0) because executionDetails is an array with only one element, which holds the various values. So I just need to index the array with [0] and then read the start value:
@activity('CopyData').output.executionDetails[0].start
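The same indexing can be illustrated in Python on a shortened copy of the activity output. This is a sketch of the data shape only, not Data Factory expression syntax; the variable names are made up.

```python
# Shortened copy of the Copy Data output from the question.
output = {
    "executionDetails": [
        {"status": "Succeeded", "start": "2019-08-06T12:29:20.477586Z", "duration": 4}
    ]
}

details = output["executionDetails"]
print(len(details))  # a single element, so the only valid index is 0

start = details[0]["start"]
print(start)
```

Because executionDetails is a list, the start value has to be read from one of its elements, and with a single element only index 0 exists.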

AWS boto3 page_iterator.search can't compare datetime.datetime to str

I'm trying to capture delta files (files created after the last processing run) sitting on S3. To do that, I'm using the boto3 filtered iterator to query on the LastModified value, rather than returning the full list of files and filtering on the client side.
According to http://jmespath.org/?, the query below is valid and filters the sample JSON response:
filtered_iterator = page_iterator.search(
    "Contents[?LastModified>='datetime.datetime(2016, 12, 27, 8, 5, 37, tzinfo=tzutc())'].Key")
for key_data in filtered_iterator:
    print(key_data)
However, it fails with:
RuntimeError: xxxxxxx has failed: can't compare datetime.datetime to str
Sample paginator response:
{
"Contents": [{
"LastModified": "datetime.datetime(2016, 12, 28, 8, 5, 31, tzinfo=tzutc())",
"ETag": "1022dad2540da33c35aba123476a4622",
"StorageClass": "STANDARD",
"Key": "blah1/blah11/abc.json",
"Owner": {
"DisplayName": "App-AWS",
"ID": "bfc77ae78cf43fd1b19f24f99998cb86d6fd8220dbfce0ce6a98776253646656"
},
"Size": 623
}, {
"LastModified": "datetime.datetime(2016, 12, 28, 8, 5, 37, tzinfo=tzutc())",
"ETag": "1022dad2540da33c35abacd376a44444",
"StorageClass": "STANDARD",
"Key": "blah2/blah22/xyz.json",
"Owner": {
"DisplayName": "App-AWS",
"ID": "bfc77ae78cf43fd1b19f24f99998cb86d6fd8220dbfce0ce6a81234e632c5a8c"
},
"Size": 702
}
]
}
The JMESPath implementation used by boto3 does not support filtering on dates (in your example it treats "unicode" and "datetime" as incompatible types). However, because of the way the dates are parsed by Amazon, you can compare them lexicographically using JMESPath's to_string() function.
Something like this:
"Contents[?to_string(LastModified)>='\"2015-01-01 01:01:01+00:00\"']"
Keep in mind that this is a lexicographical comparison, not a date comparison. It works most of the time, though.
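If an exact date comparison matters, one alternative is to compare datetime objects directly while looping over the pages. This is a minimal sketch and not part of the answer above. The names keys_modified_since and sample_pages are hypothetical, and with a real paginator you would pass the result of paginate(Bucket=...) as pages. As far as I know, the expression given to search() is also evaluated on the client after each page is fetched, so this approach should not transfer more data.

```python
from datetime import datetime, timezone


def keys_modified_since(pages, cutoff):
    """Yield keys whose LastModified is at or after cutoff, across all pages."""
    for page in pages:
        for obj in page.get("Contents", []):
            if obj["LastModified"] >= cutoff:
                yield obj["Key"]


# Hypothetical pages shaped like list_objects responses; boto3 returns
# LastModified as timezone-aware datetime objects.
sample_pages = [
    {
        "Contents": [
            {"Key": "blah1/blah11/abc.json",
             "LastModified": datetime(2016, 12, 28, 8, 5, 31, tzinfo=timezone.utc)},
            {"Key": "blah2/blah22/xyz.json",
             "LastModified": datetime(2016, 12, 28, 8, 5, 37, tzinfo=timezone.utc)},
        ]
    }
]

# The cutoff must be timezone-aware as well, or the comparison raises TypeError.
cutoff = datetime(2016, 12, 28, 8, 5, 35, tzinfo=timezone.utc)
recent_keys = list(keys_modified_since(sample_pages, cutoff))
print(recent_keys)
```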
After spending a few minutes on the boto3 paginator documentation, I realized it is actually a syntax problem, which I had overlooked because the expression is passed as a string.
The quote character that encloses the comparison value on the right is a backquote/backtick, symbol [ ` ]. You cannot use a single quote [ ' ] for the comparison values/objects.
After inspecting the JMESPath examples, I noticed that they use backquotes for the comparison value. So the boto3 paginator implementation does comply with the JMESPath standard.
Here is the code I ran without error using the backquote:
import boto3
s3 = boto3.client("s3")
s3_paginator = s3.get_paginator('list_objects')
s3_iterator = s3_paginator.paginate(Bucket='mytestbucket')
filtered_iterator = s3_iterator.search(
    "Contents[?LastModified >= `datetime.datetime(2016, 12, 27, 8, 5, 37, tzinfo=tzutc())`].Key"
)
for key_data in filtered_iterator:
    print(key_data)

How to make an intersect in SOLR?

I am implementing a Solr project, and my indexed objects have the structure below.
{
"OBJECT_HEADER_ID": 173604,
"CHARACTERISTIC_VALUE_ID": 143287,
"OBJECT_TYPE_ID": 1,
"SEQUENCE": 0,
"CHARACTERISTIC_ID": 1488,
"OBJECT_VARIANT_ID": 169941,
"ID": "84445897",
"TYPE": 0
},
{
"OBJECT_HEADER_ID": 173604,
"CHARACTERISTIC_VALUE_ID": 23502,
"OBJECT_TYPE_ID": 1,
"SEQUENCE": 0,
"CHARACTERISTIC_ID": 992,
"OBJECT_VARIANT_ID": 169941,
"ID": "84445898",
"TYPE": 0
}
I need to intersect the results of several subqueries, and I haven't found anything on the web about how to write such a query. For example:
-> Get all results that have (CHARACTERISTIC_ID = 1488 and CHARACTERISTIC_VALUE_ID = 143287) INTERSECT BY OBJECT_VARIANT_ID WITH (CHARACTERISTIC_ID = 992 and CHARACTERISTIC_VALUE_ID = 23502).
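To pin down the intended result, the intersect can be written as a set intersection in plain Python over the two sample documents. This is only a sketch of the desired semantics, not Solr query syntax; the helper name variants_matching is made up for illustration.

```python
# The two indexed documents from the question, reduced to the relevant fields.
docs = [
    {"ID": "84445897", "OBJECT_VARIANT_ID": 169941,
     "CHARACTERISTIC_ID": 1488, "CHARACTERISTIC_VALUE_ID": 143287},
    {"ID": "84445898", "OBJECT_VARIANT_ID": 169941,
     "CHARACTERISTIC_ID": 992, "CHARACTERISTIC_VALUE_ID": 23502},
]


def variants_matching(docs, characteristic_id, value_id):
    """Return the OBJECT_VARIANT_IDs that have the given characteristic/value pair."""
    return {
        d["OBJECT_VARIANT_ID"]
        for d in docs
        if d["CHARACTERISTIC_ID"] == characteristic_id
        and d["CHARACTERISTIC_VALUE_ID"] == value_id
    }


# Variants that satisfy both subqueries.
common = variants_matching(docs, 1488, 143287) & variants_matching(docs, 992, 23502)
print(common)
```

A variant is kept only if it appears in the result of every subquery, which is the behaviour the INTERSECT BY OBJECT_VARIANT_ID wording describes.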