Query for entire JSON document in nested JSON schema - sql

Background:
I wish to locate the entire JSON document that has a condition where "state" = "new" and where length(Features.id) > 4
{
"id": "123"
"feedback": {
"Features": [
{
"state": "new"
"id": "12345"
}
]
}
}
This is what I have tried to do:
Since this is a nested document. My query looks like this:
A stackoverflow member has helped me to access the nested contents within the query, but is there a way to obtain the full document
I have used:
SELECT VALUE t.id FROM t IN f.feedback.Features where t.state = 'new' and length(t.id)>4
This will give me the ids.
My desire is to have access to the full document with this condition?
{
"id": "123"
"feedback": {
"Features": [
{
"state": "new"
"id": "12345"
}
]
}
}
Any help is appreciated

Try this
SELECT *
FROM f
WHERE
f.feedback.Features[0].state = 'new'
AND length(f.feedback.Features[0].id)>4
Here is the SELECT spec for CosmosDB for more details
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-select
Also, check out "working with JSON" in CosmosDB notes
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-working-with-json
If the Features array has more than 1 value, you can use EXISTS clause to search within them. See specs of EXISTS here with examples:
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-subquery#exists-expression

Related

Using PostgreSQL JSON function to obtain an array object from a JSON stored key

I have a table on AWS RDS PostgreSQL that stores JSON objects. For instance I have this registry:
{
"id": "87b05c62-4153-4341-9b58-e86bade25ffd",
"title": "Just Ok",
"rating": 2,
"gallery": [
{
"id": "1cb158af-0983-4bac-9e4f-0274b3836cdd",
"typeCode": "PHOTO"
},
{
"id": "aae64f19-22a8-4da7-b40a-fbbd8b2ef30b",
"typeCode": "PHOTO"
}
],
"reviewer": {
"memberId": "2acf2ea7-7a37-42d8-a019-3d9467cbdcd1",
},
"timestamp": {
"createdAt": "2011-03-30T09:52:36.000Z",
"updatedAt": "2011-03-30T09:52:36.000Z"
},
"isUserVerified": true,
}
And I would like to create a query for obtaining one of the gallery objects.
I have tried this but get both objects in the array:
SELECT jsonb_path_query(data->'gallery', '$[*]') AS content
FROM public.reviews
WHERE jsonb_path_query_first(data->'gallery', '$.id') ? '1cb158af-0983-4bac-9e4f-0274b3836cdd'
With this other query I get the first object:
SELECT jsonb_path_query_first(data->'gallery', '$[*]') AS content
FROM public.reviews
WHERE jsonb_path_query_first(data->'gallery', '$.id') ? '1cb158af-0983-4bac-9e4f-0274b3836cdd'
But filtering by the second array object id, I get no result:
SELECT jsonb_path_query_first(data->'gallery', '$[*]') AS content
FROM public.reviews
WHERE jsonb_path_query_first(data->'gallery', '$.id') ? 'aae64f19-22a8-4da7-b40a-fbbd8b2ef30b'
I have read the official documentation and tried other functions like jsonb_path_exists or jsonb_path_match on the where condition but was not able to make the query work.
Any help would be greatly appreciated. Thanks in advance.
I managed to get the query working as needed. Here is my proposal:
SELECT gallery
FROM public.reviews, jsonb_path_query(data->'gallery', '$[*]') as gallery
WHERE data->>'id' = '87b05c62-4153-4341-9b58-e86bade25ffd' and gallery->>'id' = 'aae64f19-22a8-4da7-b40a-fbbd8b2ef30b'
Hope it helps others.

How to extract the field from JSON object with QueryRecord

I have been struggling with this problem for a long time. I need to create a new JSON flowfile using QueryRecord by taking an array (field ref) from input JSON field refs and skip the object field as shown in example below:
Input JSON flowfile
{
"name": "name1",
"desc": "full1",
"refs": {
"ref": [
{
"source": "source1",
"url": "url1"
},
{
"source": "source2",
"url": "url2"
}
]
}
}
QueryRecord configuration
JSONTreeReader setup as Infer Schema and JSONRecordSetWriter
select name, description, (array[rpath(refs, '//ref[*]')]) as sources from flowfile
Output JSON (need)
{
"name": "name1",
"desc": "full1",
"references": [
{
"source": "source1",
"url": "url1"
},
{
"source": "source2",
"url": "url2"
}
]
}
But got error:
QueryRecord Failed to write MapRecord[{references=[Ljava.lang.Object;#27fd935f, description=full1, name=name1}] with schema ["name" : "STRING", "description" : "STRING", "references" : "ARRAY[STRING]"] as a JSON Object due to java.lang.ClassCastException: null
Try the following approach, in your case it shoud work:
1) Read your JSON field fully (I imitated it with GenerateFlowFile processor with your example)
2) Add EvaluateJsonPath processor which will put 2 header fileds (name, desc) into the attributes:
3) Add SplitJson processor which will split your JSON byt refs/ref/ groups (split by "$.refs.ref"):
4) Add ReplaceText processor which will add you header fields (name, desc) to the split lines (replace "[{]" value with "{"name":"${json.name}","desc":"${json.desc}","):
5) It`s done:
Full process in my demo case:
Hope this helps.
Solution!: use JoltTransformJSON to transform JSON by Jolt specification. About this specification.

Returning unknown JSON in a query

Here is my scenario. I have data in a Cosmos DB and I want to return c.this, c.that etc as the indexer for Azure Cognitive Search. One field I want to return is JSON of an unknown structure. The one thing I do know about it is that it is flat. However it is my understanding that the return value for an indexer needs to be known. How, using SQL in a SELECT, would I return all JSON elements in the flat object? Here is an example value I would be querying:
{
"BusinessKey": "SomeKey",
"Source": "flat",
"id": "SomeId",
"attributes": {
"Source": "flat",
"Element": "element",
"SomeOtherElement": "someOtherElement"
}
}
So I would want my select to be maybe something like:
SELECT
c.BusinessKey,
c.Source,
c.id,
-- SOMETHING HERE TO LIST OUT ALL ATTRIBUTES IN THE JSON AS FIELDS IN THE RESULT
And I would want the result to be:
{
"BusinessKey": "SomeKey",
"Source": "flat",
"id": "SomeId",
"attributes": [{"Source":"flat"},{"Element":"element"},{"SomeOtherElement":"someotherelement"}]
}
Currently we are calling ToString on the c.attributes, which is the JSON of unknown structure but it is adding all the escape characters. When we want to search the index, we have to add all those escape characters and it's getting really unruly.
Is there a way to do this using SQL?
Thanks for any help!
You could use UDF in cosmos db sql.
UDF code:
function userDefinedFunction(object){
var returnArray = [];
for (var key in object) {
var map = {};
map[key] = object[key];
returnArray.push(map);
}
return returnArray;
}
Sql:
SELECT
c.BusinessKey,
c.Source,
c.id,
udf.test(c.attributes) as attributes
from c
Output:

SQL query as string in JSON file

It is possible to store SQL query in JSON array of objects?? Because when i have something like this:
[{
"id": "1",
"query": "SELECT ID FROM table"
},
{
"id": "2",
"query": "SELECT ID FROM table"
},
{
"id": "3",
"query": "SELECT USER FROM table"
}
]
JSON file in VSCode is ok no error it is getting nasty when i want to store complex queries with joins etc.
for example this query even if i format it correctly it will generate error in JSON file about formatting
(just example i not it is not valid)
SELECT user, id, , count(price) as numrev
FROM price
where id = 1 and user = 0
group by user, id, price
that it can't be stored in string
It is bit easy to do, but requires on extra step.
Simply convert/encode you raw SQL queries in base64 text.
Decode the text before you execute the queries in you code.
If the JSON file is created automatically by a program/code
All most all programming languages proved base64 encode / decode functions as part of the core if not download compatible package / library to achieve this automation
var queries = [{
"id": "1",
"query": "U0VMRUNUIElEIEZST00gdGFibGU="
},
{
"id": "2",
"query": "U0VMRUNUIElEIEZST00gdGFibGU="
},
{
"id": "3",
"query": "U0VMRUNUIFVTRVIgRlJPTSB0YWJsZQ=="
},
{
"id": "4",
"query": "U0VMRUNUIHVzZXIsIGlkLCAsIGNvdW50KHByaWNlKSBhcyBudW1yZXYKICBGUk9NIHByaWNlCiAgd2hlcmUgaWQgPSAxIGFuZCB1c2VyID0gMCAKICBncm91cCBieSB1c2VyLCBpZCwgcHJpY2U="
}
];
for (i = 0; i < queries.length; i++) {
console.log("id = " + queries[i].id + ", query = " + atob(queries[i].query));
}
when you parse you JSON array make sure to decode before you execute the SQL queries.
let me know this one helped you.. ☺☻☺
FYI , refer http://www.utilities-online.info/base64/
enter image description here

How to Query Cosmos DB graph by use of SQL CONTAINS

I have a Cosmo DB graph where I would like to access the 'name' field in an expression using the string matching CONTAINS in Cosmos DB. CONTAINS works at 1 level as in matching CONATINS
SELECT s.label, s.name FROM s WHERE CONTAINS(LOWER(s.name._value), "cara") AND s.label = "site"
I also tried with a UDF function
SELECT s.label, s.name FROM s WHERE(s.label = 'site' AND udf.strContains(s.name._value, '/cara/i'))
I don't get any hits or syntax errors from Cosmos DB even that should be at least one record in this example. Does anyone have a hint? Thanks in advance
[
{
"label": "site",
"name": [
{
"_value": "0315817 Caracol",
"id": "2e2f000d-2e0a-435a-b472-75d257236558"
}
]
},
{
"label": "site",
"name": [
{
"_value": "0315861 New Times",
"id": "48497172-1734-43d0-9866-51faf9f603ed"
}
]
}
]
I noticed that the name property is an array not an object.So, you need to use join in sql.
SELECT s.label, s.name , name._value FROM s
join name in s.name
where CONTAINS(LOWER(name._value), "cara") AND s.label = "site"
Output:
Hope it helps you.