Snowflake JSON FLATTEN with ORDER BY - sql

I have a working query that flattens a nested JSON object into rows of data. What I would like to do, however, is preserve the original order of one array of objects which is nested several layers in.
I have tried ROW_NUMBER with ORDER BY NULL and with ORDER BY (SELECT NULL), and neither seems to preserve the order.
Any ideas on how to accomplish that? Examples below. I chose to mask the real data, but the important parts of the structure are the same. The data in JSON format comes through with no rank-identifying information, but I used numbers as examples here to show the strange results.
Original structure (masked):
{
"topNode": {
"childNode": {
"list": [
{
"title": "example title 1"
},
{
"title": "example title 2"
},
{
"title": "example title 3"
},
{
"title": "example title 4"
},
{
"title": "example title 5"
}
]
}
}
}
Example query (masked):
SELECT
A.VALUE:"title"::VARCHAR AS "TITLE",
ROW_NUMBER() OVER(ORDER BY NULL) AS RANK
FROM
DB.SCHEMA.TABLE as A,
lateral flatten(input=>A.JSON:topNode.childNode.list) "list_flatten"
Example output:
TITLE RANK
"example title 3" 1
"example title 5" 2
"example title 2" 3
"example title 1" 4
"example title 4" 5

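This is possible with the INDEX column that FLATTEN returns, which holds the position of each element in its array. ROW_NUMBER() OVER (ORDER BY NULL) only numbers the rows in whatever order they happen to be processed, which is not guaranteed to match the array order. INDEX is zero-based, so add 1 if the rank should start at 1. VALUE and INDEX both come from the FLATTEN output, so they are qualified with the flatten alias rather than the table alias:
SELECT "list_flatten".VALUE:"title"::VARCHAR AS "TITLE",
"list_flatten".INDEX AS "RANK"
FROM DB.SCHEMA.TABLE AS A,
LATERAL FLATTEN(input => A.JSON:topNode.childNode.list) "list_flatten"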
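For intuition, the idea behind INDEX can be modelled outside Snowflake. The following is a minimal Python sketch, not Snowflake code: flatten_with_index is a hypothetical helper that mimics the INDEX and VALUE columns of FLATTEN, and the sample document is the masked one from the question. Because each row carries its array position, the original order can be restored no matter how the rows arrive.

```python
# Hypothetical in-memory copy of the masked document from the question.
doc = {
    "topNode": {
        "childNode": {
            "list": [{"title": f"example title {n}"} for n in range(1, 6)]
        }
    }
}


def flatten_with_index(items):
    """Yield (index, value) pairs, similar to FLATTEN's INDEX and VALUE columns."""
    for index, value in enumerate(items):
        yield index, value


# One row per array element; RANK is the zero-based index shifted to start at 1.
rows = [
    {"TITLE": value["title"], "RANK": index + 1}
    for index, value in flatten_with_index(doc["topNode"]["childNode"]["list"])
]

# Simulate rows arriving in a different order, then restore the array order
# by sorting on the stored position.
rows_sorted = sorted(reversed(rows), key=lambda row: row["RANK"])
```

The point of the sketch is that the position is captured while the array is being expanded, instead of being assigned afterwards by a window function with no meaningful sort key.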

Related

Solr Nested Documents: query for parent document which has several specific nested documents

Let's imagine I have a document with several nested documents:
{
"id": "doc1",
"type": "maindoc",
"title": "some document 1 title",
"nested": [
{
"id": "nested1",
"nested_type": "nestedType1",
"title": "nested doc 1 title"
},
{
"id": "nested2",
"nested_type": "nestedType2",
"title": "nested doc 2 title"
},
{
"id": "nested3",
"nested_type": "nestedType3",
"title": "nested doc 3 title"
}
]
}
So now if I want to search for a document that has nested doc 1, I do this:
{!parent which='type:maindoc'}
nested_type:nestedType1
But what if I want to search for a document that has two specific children at the same time?
For example, I want to find a doc that has both nestedType1 and nestedType2.
Obviously a query like this will not work, because no single child document has both types:
{!parent which='type:maindoc'}
nested_type:nestedType1 AND nested_type:nestedType2
So how can I do that? Is that possible at all?
Something like this did the trick in my testing:
({!parent which='type:maindoc' v='nested_type:nestedType1'}) AND ({!parent which='type:maindoc' v='nested_type:nestedType2'})
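The logic of that query can be modelled in plain Python. This is only a sketch of the matching semantics, not Solr code: the docs list and the has_child helper are hypothetical. It shows why the intersection of two separate parent queries works, while a single child query with AND matches nothing.

```python
# Hypothetical sample data: doc1 has all three child types, doc2 only one.
docs = [
    {
        "id": "doc1",
        "type": "maindoc",
        "nested": [
            {"id": "nested1", "nested_type": "nestedType1"},
            {"id": "nested2", "nested_type": "nestedType2"},
            {"id": "nested3", "nested_type": "nestedType3"},
        ],
    },
    {
        "id": "doc2",
        "type": "maindoc",
        "nested": [{"id": "nested4", "nested_type": "nestedType1"}],
    },
]


def has_child(doc, nested_type):
    """True if at least one child of the parent has the given type."""
    return any(child["nested_type"] == nested_type for child in doc.get("nested", []))


# Two independent "parent of a matching child" conditions, combined with AND.
matches = [
    doc["id"]
    for doc in docs
    if doc["type"] == "maindoc"
    and has_child(doc, "nestedType1")
    and has_child(doc, "nestedType2")
]

# The naive form asks for ONE child that has both types, which never happens.
naive_matches = [
    doc["id"]
    for doc in docs
    if any(
        child["nested_type"] == "nestedType1" and child["nested_type"] == "nestedType2"
        for child in doc["nested"]
    )
]
```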

Get data from any items in array

PostgreSQL 9.5
Field type: jsonb
Here is the JSON:
{
"options": [
{
"name": "method"
},
{
"name": "flavor"
},
{
"name": "weight",
"value": {
"name": "300g"
}
}
]
}
And here is a query that gets the value of the item (weight) at index 2 of the array:
SELECT
id,
product.data #>'{title,en}' AS title_en,
product.data #>>'{options, 2, value, name }' as options_weight_value
FROM product
Nice, it works fine.
But the problem is that weight can be at any index in the array: first, second, and so on.
So I need to get the value of name (300g) in the "weight" node.
I need something like this:
SELECT
id,
product.data #>'{title,en}' AS title_en,
product.data #>>'{options, *, value, name, weight }' as options_weight_value
FROM product
Is it possible?
I think I found a solution:
SELECT
id,
p.data #>'{title,en}' AS title_en,
p.data #>'{weight,qty}' AS weight_qty,
(select *
from jsonb_array_elements(p.data -> 'options') AS options_array
where
options_array ->> 'name' = 'weight'
) #>'{value,name}' as options_weight
from product p
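This now finds the value of weight (if it exists) in any item of the array. In this example it is 300g. Note that the scalar subquery raises an error if more than one array element has the name weight.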
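The same lookup can be sketched in plain Python to make the logic explicit. This is an illustration only, not PostgreSQL code: option_value_name is a hypothetical helper and product_data is the sample JSON from the question. Unlike a scalar subquery, this sketch simply returns the first match.

```python
# Sample jsonb content from the question, as a Python dict.
product_data = {
    "options": [
        {"name": "method"},
        {"name": "flavor"},
        {"name": "weight", "value": {"name": "300g"}},
    ]
}


def option_value_name(data, option_name):
    """Return value.name of the first option with the given name, or None."""
    for option in data.get("options", []):
        if option.get("name") == option_name:
            # An option may exist without a "value" object.
            return option.get("value", {}).get("name")
    return None


weight = option_value_name(product_data, "weight")
```

The position of the weight entry no longer matters, because the search is by name instead of by array index.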

Query for entire JSON document in nested JSON schema

Background:
I want to retrieve the entire JSON document when one of its features has "state" = "new" and length(Features.id) > 4:
{
"id": "123",
"feedback": {
"Features": [
{
"state": "new",
"id": "12345"
}
]
}
}
This is what I have tried so far. Since this is a nested document, a Stack Overflow member helped me write a query that accesses the nested contents:
SELECT VALUE t.id FROM t IN f.feedback.Features where t.state = 'new' and length(t.id)>4
This gives me the ids.
Is there a way to get the full document that meets this condition instead?
{
"id": "123",
"feedback": {
"Features": [
{
"state": "new",
"id": "12345"
}
]
}
}
Any help is appreciated.
Try this:
SELECT *
FROM f
WHERE
f.feedback.Features[0].state = 'new'
AND length(f.feedback.Features[0].id)>4
Here is the SELECT spec for CosmosDB for more details
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-select
Also, check out "working with JSON" in CosmosDB notes
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-working-with-json
The query above only checks the first element. If the Features array can hold more than one element, you can use an EXISTS subquery to search within all of them. See the EXISTS documentation, with examples, here:
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-subquery#exists-expression
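The difference between testing Features[0] and testing every element can be shown with a small Python model. This is a sketch of the filtering logic only, not Cosmos DB SQL: matches_first, matches_any and the sample documents are hypothetical.

```python
# Hypothetical documents: in the second one the matching feature is not first.
documents = [
    {"id": "123", "feedback": {"Features": [{"state": "new", "id": "12345"}]}},
    {
        "id": "456",
        "feedback": {
            "Features": [
                {"state": "old", "id": "1"},
                {"state": "new", "id": "67890"},
            ]
        },
    },
]


def is_match(feature):
    """The condition from the question: state is 'new' and id is longer than 4."""
    return feature["state"] == "new" and len(feature["id"]) > 4


def matches_first(doc):
    """Model of indexing Features[0]: only the first element is inspected."""
    features = doc["feedback"]["Features"]
    return bool(features) and is_match(features[0])


def matches_any(doc):
    """Model of an EXISTS-style check: any element may satisfy the condition."""
    return any(is_match(feature) for feature in doc["feedback"]["Features"])


first_only_ids = [doc["id"] for doc in documents if matches_first(doc)]
any_element_ids = [doc["id"] for doc in documents if matches_any(doc)]
```

In both cases the whole document is returned, not just the matching ids, which is what the question asks for.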

How to "zip" multiple nested JSON arrays without using id key?

I'm trying to merge some nested JSON arrays without looking at the id. Currently I'm getting this when I make a GET request to /surveyresponses:
{
"surveys": [
{
"id": 1,
"name": "survey 1",
"isGuest": true,
"house_id": 1
},
{
"id": 2,
"name": "survey 2",
"isGuest": false,
"house_id": 1
},
{
"id": 3,
"name": "survey 3",
"isGuest": true,
"house_id": 2
}
],
"responses": [
{
"question": "what is this anyways?",
"answer": "test 1"
},
{
"question": "why?",
"answer": "test 2"
},
{
"question": "testy?",
"answer": "test 3"
}
]
}
But I would like each survey to carry its own question and answer, something like this:
{
"surveys": [
{
"id": 1,
"name": "survey 1",
"isGuest": true,
"house_id": 1,
"question": "what is this anyways?",
"answer": "test 1"
}
]
}
Because I'm not going to a specific id I'm not sure how to make the relationship work. This is the current query I have that's producing those results.
export function getSurveyResponse(id: number): QueryBuilder {
return db('surveys')
.join('questions', 'questions.survey_id', '=', 'surveys.id')
.join('questionAnswers', 'questionAnswers.question_id', '=', 'questions.id')
.select('surveys.name', 'questions.question', 'questions.question', 'questionAnswers.answer')
.where({ survey_id: id, question_id: id })
}
Assuming jsonb in current Postgres 10 or 11, this query does the job:
SELECT t.data, to_jsonb(s) AS new_data
FROM tbl t
LEFT JOIN LATERAL (
SELECT jsonb_agg(s || r) AS surveys
FROM (
SELECT jsonb_array_elements(t.data->'surveys') s
, jsonb_array_elements(t.data->'responses') r
) sub
) s ON true;
I unnest both nested JSON arrays in parallel to get the desired behavior of "zipping" both directly. The number of elements in both nested JSON arrays has to match or you need to do more (else you lose data).
This builds on implementation details of how Postgres deals with multiple set-returning functions in a SELECT list to make it short and fast. See:
What is the expected behaviour for multiple set-returning functions in select clause?
One could be more explicit with a ROWS FROM expression, which works properly since Postgres 9.4:
SELECT t.data
, to_jsonb(s) AS new_data
FROM tbl t
LEFT JOIN LATERAL (
SELECT jsonb_agg(s || r) AS surveys
FROM ROWS FROM (jsonb_array_elements(t.data->'surveys')
, jsonb_array_elements(t.data->'responses')) sub(s,r)
) s ON true;
The manual about combining multiple table functions.
Or you could use WITH ORDINALITY to get original order of elements and combine as you wish:
PostgreSQL unnest() with element number
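The positional "zip" that these queries perform can be sketched in plain Python, which may help when checking the expected result. This is an illustration, not PostgreSQL code: zip_surveys is a hypothetical helper and the sample data is a shortened version of the response in the question. Raising an error on a length mismatch is a choice made for this sketch, so that no data is lost silently.

```python
def zip_surveys(data):
    """Merge surveys[i] with responses[i] by position, like jsonb 's || r'."""
    surveys = data["surveys"]
    responses = data["responses"]
    if len(surveys) != len(responses):
        # Assumption for this sketch: refuse to pair arrays of unequal length.
        raise ValueError("surveys and responses must have the same length")
    # Keys from the response win on conflict, as the right operand does with ||.
    return {"surveys": [{**s, **r} for s, r in zip(surveys, responses)]}


# Shortened sample data based on the question.
data = {
    "surveys": [
        {"id": 1, "name": "survey 1", "isGuest": True, "house_id": 1},
        {"id": 2, "name": "survey 2", "isGuest": False, "house_id": 1},
    ],
    "responses": [
        {"question": "what is this anyways?", "answer": "test 1"},
        {"question": "why?", "answer": "test 2"},
    ],
}

merged = zip_surveys(data)
```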

Postgresql SELECTing from JSON column

Assume I am using PG 9.3 and I have a post table with a json column 'meta_data':
Example content of the json column 'meta_data'
{
"content": "this is a post body",
"comments": [
{
"user_id": 1,
"content": "hello"
},
{
"user_id": 2,
"content": "foo"
},
{
"user_id": 3,
"content": "bar"
}
]
}
How can I find all the posts where the user_id = 1 from the comments array from the meta_data column?
I'm almost positive I'm implementing this incorrectly, but try this:
select *
from posts
where id in (
select id from (
select id,
json_array_elements(meta_data->'comments')->'user_id' as user_id
from posts
) x
where cast(user_id as varchar) = '1'
);
There's probably an array operator like #> that will remove the need for the nested select statements but I can't seem to get it to work right now.
Let me know if this is going down the correct track, I'm sure we could figure it out if required.
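The intent of the nested select can be expressed in a few lines of Python, which is useful for checking results against a small sample. This is a sketch only, not PostgreSQL code: posts_commented_by and the sample rows are hypothetical, with meta_data shaped like the example in the question.

```python
# Hypothetical rows of the posts table, with meta_data already parsed from JSON.
posts = [
    {
        "id": 10,
        "meta_data": {
            "content": "this is a post body",
            "comments": [
                {"user_id": 1, "content": "hello"},
                {"user_id": 2, "content": "foo"},
            ],
        },
    },
    {
        "id": 11,
        "meta_data": {
            "content": "another post",
            "comments": [{"user_id": 3, "content": "bar"}],
        },
    },
    {"id": 12, "meta_data": {"content": "no comments yet"}},
]


def posts_commented_by(rows, user_id):
    """Return posts that have at least one comment from the given user."""
    return [
        row
        for row in rows
        if any(
            comment.get("user_id") == user_id
            for comment in row["meta_data"].get("comments", [])
        )
    ]


matching_ids = [row["id"] for row in posts_commented_by(posts, 1)]
```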