PostgreSQL: search if a value exists in nested jsonb - sql

I'm new to jsonb queries and I have a problem. In an 'items' table, I have an 'id' column and a 'data' jsonb column. Here is an example of what the data can look like:
[
  {
    "paramId": 3,
    "value": "dog"
  },
  {
    "paramId": 4,
    "value": "cat"
  },
  {
    "paramId": 5,
    "value": "fish"
  },
  {
    "paramId": 6,
    "value": "",
    "fields": [
      {
        "paramId": 3,
        "value": "cat"
      },
      {
        "paramId": 4,
        "value": "dog"
      }
    ]
  },
  {
    "paramId": 6,
    "value": "",
    "fields": [
      {
        "paramId": 5,
        "value": "cat"
      },
      {
        "paramId": 3,
        "value": "dog"
      }
    ]
  }
]
The value in data is always an array of objects, but an object can sometimes have a 'fields' key containing more objects. Nesting is at most one level deep.
How can I select the id of the items which have, for example, an object containing "paramId": 3 and "value": "cat" and also an object with "paramId": 5 and "value" LIKE '%ish%'?
I have already found a way to do that when the objects are at level 0:
SELECT i.*
FROM items i
JOIN LATERAL jsonb_array_elements(i.data) obj3(val) ON obj3.val->>'paramId' = '3'
JOIN LATERAL jsonb_array_elements(i.data) obj5(val) ON obj5.val->>'paramId' = '5'
WHERE obj3.val->>'value' = 'cat'
AND obj5.val->>'value' LIKE '%ish%';
but I don't know how to search inside the 'fields' array when it exists.
Thank you in advance for your help.
EDIT:
It looks like my question was not clear, so I will try to explain it better.
I want to find all the 'item' rows whose 'data' column contains objects matching my search criteria, regardless of whether those objects are at the first level or inside the 'fields' key of an object.
Again, for example, this record should be selected if I search for:
'paramId': 3 AND 'value': 'cat'
'paramId': 4 AND 'value' LIKE '%og%'
The matching objects are inside the 'fields' key of an object with 'paramId': 6, and I don't know how to reach them there.

This can be expressed with a SQL/JSON path expression, without the need to unnest everything.
To search for paramId = 3 and value = 'cat':
select *
from items
where data @? '$[*] ? ( (@.paramId == 3 && @.value == "cat") || exists( @.fields[*] ? (@.paramId == 3 && @.value == "cat")) )'
The $[*] part iterates over all elements of the first-level array. To check the elements of the fields array, the exists() operator is used to nest the expression: @.fields[*] iterates over all elements of the fields array and applies the same condition again. I don't see a way to avoid repeating the values, though.
For a "like" condition, you can use like_regex:
select *
from items
where data @? '$[*] ? ( (@.paramId == 4 && @.value like_regex ".*og.*") || exists( @.fields[*] ? (@.paramId == 4 && @.value like_regex ".*og.*")) )'
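Note that the @? operator and like_regex are part of the SQL/JSON path support added in PostgreSQL 12. The original question asks for several criteria at once; a minimal sketch for that, under the same assumptions, is to AND together one @? condition per criterion, each covering both nesting levels:
select id
from items
where data @? '$[*] ? ( (@.paramId == 3 && @.value == "cat") || exists( @.fields[*] ? (@.paramId == 3 && @.value == "cat")) )'
  and data @? '$[*] ? ( (@.paramId == 5 && @.value like_regex ".*ish.*") || exists( @.fields[*] ? (@.paramId == 5 && @.value like_regex ".*ish.*")) )';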

For now I have found a solution but it is not really clean and I don't know how it will perform in production with 10M records.
SELECT i.id, i.data
FROM ( -- A;
    select it.id, it.data, i as value
    from items it,
         jsonb_array_elements(it.data) i
    union
    select it.id, it.data, f as value
    from items it,
         jsonb_array_elements(it.data) i,
         jsonb_array_elements(i -> 'fields') f
) as i
WHERE (i.value ->> 'paramId' = '5' -- B1;
       AND i.value ->> 'value' LIKE '%ish%')
   OR (i.value ->> 'paramId' = '3' -- B2;
       AND i.value ->> 'value' = 'cat')
GROUP BY i.id, i.data
HAVING COUNT(*) >= 2; -- C;
A: I "flatten" the first and second level (second level is in 'fields' key)
B1, B2: These are my search criteria
C: I make sure the fields have all the criteria matching. If 3 criteria --> COUNT(*) >=3
It really doesn't look clean to me. It is working for dev purpose but I think there is a better way to do it.
If somebody have an idea Big thanks to him/her!
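As a rough alternative sketch, not a tested rewrite: with the same items table and the same two sample criteria, each criterion can get its own EXISTS over the flattened elements, which removes the GROUP BY / HAVING COUNT(*) bookkeeping.
SELECT i.id, i.data
FROM items i
WHERE EXISTS (                      -- criterion 1: paramId 3 / value 'cat'
    SELECT 1
    FROM jsonb_array_elements(i.data) e
    -- when 'fields' is absent the function yields no rows, so the LEFT JOIN keeps the element with f.value = NULL
    LEFT JOIN LATERAL jsonb_array_elements(e.value -> 'fields') f ON true
    WHERE (e.value ->> 'paramId' = '3' AND e.value ->> 'value' = 'cat')
       OR (f.value ->> 'paramId' = '3' AND f.value ->> 'value' = 'cat')
)
AND EXISTS (                        -- criterion 2: paramId 5 / value LIKE '%ish%'
    SELECT 1
    FROM jsonb_array_elements(i.data) e
    LEFT JOIN LATERAL jsonb_array_elements(e.value -> 'fields') f ON true
    WHERE (e.value ->> 'paramId' = '5' AND e.value ->> 'value' LIKE '%ish%')
       OR (f.value ->> 'paramId' = '5' AND f.value ->> 'value' LIKE '%ish%')
);
Each additional criterion becomes one more EXISTS clause instead of a higher HAVING threshold.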

Related

In BigQuery, how do I check if two ARRAY of STRUCTs are equal

I have a query that outputs two arrays of structs:
SELECT modelId, oldClassCounts, newClassCounts
FROM `xyz`
GROUP BY 1
How do I create another column that is TRUE if oldClassCounts = newClassCounts?
Here is a sample result in JSON:
[
  {
    "modelId": "FBF21609-65F8-4076-9B22-D6E277F1B36A",
    "oldClassCounts": [
      {
        "id": "A041EBB1-E041-4944-B231-48BC4CCE025B",
        "count": "33"
      },
      {
        "id": "B8E4812B-A323-47DD-A6ED-9DF877F501CA",
        "count": "82"
      }
    ],
    "newClassCounts": [
      {
        "id": "A041EBB1-E041-4944-B231-48BC4CCE025B",
        "count": "33"
      },
      {
        "id": "B8E4812B-A323-47DD-A6ED-9DF877F501CA",
        "count": "82"
      }
    ]
  }
]
I want the equality column to be TRUE if oldClassCounts and newClassCounts are exactly the same, as in the output above. Anything else should be FALSE.
I would go about it with this solution:
#standardSQL
WITH xyz AS (
  SELECT "FBF21609-65F8-4076-9B22-D6E277F1B36A" AS modelId,
         [STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" AS id, "33" AS count),
          STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" AS id, "82" AS count)] AS oldClassCounts,
         [STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" AS id, "33" AS count),
          STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" AS id, "82" AS count)] AS newClassCounts
),
o AS (
  SELECT modelId, id, count, ARRAY_LENGTH(oldClassCounts) AS len
  FROM xyz, UNNEST(oldClassCounts) AS old_c
),
n AS (
  SELECT modelId, id, count, ARRAY_LENGTH(newClassCounts) AS len
  FROM xyz, UNNEST(newClassCounts) AS new_c
),
uneq AS (
  SELECT * FROM o EXCEPT DISTINCT SELECT * FROM n
)
SELECT xyz.*,
       IF(uneq.modelId IS NOT NULL, false, true) AS equal
FROM xyz
LEFT JOIN (SELECT DISTINCT modelId FROM uneq) uneq
       ON xyz.modelId = uneq.modelId
It works regardless of the order of elements or of duplicates within the arrays. The idea is to treat each array as a separate temporary table, remove all elements that exist in one but not the other (using EXCEPT DISTINCT), and add an extra check on the length of the arrays in case there are duplicates, e.g.:
"FBF21609-65F8-4076-9B22-D6E277F1B36A" AS modelId,
[STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)]
I would consider comparing the results of the TO_JSON_STRING function applied to both of these arrays.
In the query it would be done in the following way:
SELECT modelId,
oldClassCounts,
newClassCounts,
CASE WHEN TO_JSON_STRING(oldClassCounts) = TO_JSON_STRING(newClassCounts)
THEN true
ELSE false
END
FROM `xyz`;
I'm not sure about the GROUP BY 1 part, because none of the fields are grouped or aggregated.
This is not going to work if the order of elements in the arrays differs. The solution is not perfect, but it works for the data you provided.
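If order-independence is needed here as well, one possible sketch (my addition, not part of the original answer) is to sort each array inside an ARRAY subquery before serializing it; id and count are the struct fields from the sample data:
SELECT modelId,
       TO_JSON_STRING(ARRAY(SELECT c FROM UNNEST(oldClassCounts) c ORDER BY c.id, c.count)) =
       TO_JSON_STRING(ARRAY(SELECT c FROM UNNEST(newClassCounts) c ORDER BY c.id, c.count)) AS equal
FROM `xyz`;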

Get data from any items in array

PostgreSQL 9.5
Field type: jsonb
Here is the JSON:
{
  "options": [
    {
      "name": "method"
    },
    {
      "name": "flavor"
    },
    {
      "name": "weight",
      "value": {
        "name": "300g"
      }
    }
  ]
}
And here is a query that gets the value of the item (weight) at index 2 in the array:
SELECT
id,
product.data #>'{title,en}' AS title_en,
product.data #>>'{options, 2, value, name }' as options_weight_value
FROM product
Nice, it works fine.
But the problem is that weight can be at any index in the array: first, second, and so on.
So I need to get the value of name (300g) from the "weight" node.
I need something like this:
SELECT
id,
product.data #>'{title,en}' AS title_en,
product.data #>>'{options, *, value, name, weight }' as options_weight_value
FROM product
Is it possible?
I think I found a solution:
SELECT
id,
p.data #>'{title,en}' AS title_en,
p.data #>'{weight,qty}' AS weight_qty,
(select *
from jsonb_array_elements(p.data -> 'options') AS options_array
where
options_array ->> 'name' = 'weight'
) #>'{value,name}' as options_weight
from product p
This now finds the value of weight (if it exists) in any of the array's items. In this example it is 300g.
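A possible variant of the same idea, sketched here rather than taken from the answer: a LEFT JOIN LATERAL with LIMIT 1 still returns the product row when no 'weight' option exists and does not error if more than one option ever carries that name.
SELECT p.id,
       p.data #> '{title,en}' AS title_en,
       w.opt #>> '{value,name}' AS options_weight
FROM product p
LEFT JOIN LATERAL (
    SELECT o.opt
    FROM jsonb_array_elements(p.data -> 'options') o(opt)
    WHERE o.opt ->> 'name' = 'weight'
    LIMIT 1   -- guard against several options named 'weight'
) w ON true;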

Postgres query to remove duplicates in multiple joined arrays using knex

I'm using knex to build a Postgres query and have a table of recipes with a many-to-many relationship to both a table of ingredients and a table of steps (each step being a part of an instruction). I'm trying to aggregate both the steps and the ingredients into their own arrays within the query. My problem is that as soon as I join the second table, both arrays lose their distinctness (i.e. table a has 2 elements and table b has 3 elements; after I join table b, both arrays have 6 elements).
I've tried using distinct, but every attempt has resulted in an error being thrown.
Here's what I'm trying to output:
"id": 1,
"title": "sometitle",
"ingredients": [
{
"ingredient": "avacado",
"quantity": 24
},
{
"ingredient": "asparagus",
"quantity": 42
},
],
"instructions": [
{
"step": 1,
"instruction": "one"
},
{
"step": 2,
"instruction": "two"
},
{
"step": 3,
"instruction": "three"
},
]
Here's what I have so far:
knex(`recipes as r`)
.where({'r.id': 1})
.join('ingredients_list as list', {'list.recipe_id': 'r.id'})
.join('ingredients', {'list.ingredient_id': 'ingredients.id'})
.join('instructions', {'instructions.recipe_id': 'r.id'})
.select(
'r.id',
db.raw(`json_agg(json_build_object(
'ingredient', ingredients.name,
'quantity', list.quantity
)) as ingredients`),
db.raw(`json_agg(json_build_object(
'step', instructions.step_number,
'instruction', instructions.description
)) as instructions`)
)
.groupBy('r.id')
.first()
Here's the solution I came up with, in case anyone else runs into this issue. I assume this works because Postgres is unable to evaluate equality of json objects, whereas jsonb values have a defined equality, so DISTINCT can be applied to them. I'd love a more thorough explanation of this if somebody has one.
json_agg(distinct jsonb_build_object(...))
knex(`recipes as r`)
.where({'r.id': 1})
.join('ingredients_list as list', {'list.recipe_id': 'r.id'})
.join('ingredients', {'list.ingredient_id': 'ingredients.id'})
.join('instructions', {'instructions.recipe_id': 'r.id'})
.select(
'r.id',
db.raw(`json_agg(distinct jsonb_build_object(
'ingredient', ingredients.name,
'quantity', list.quantity
)) as ingredients`),
db.raw(`json_agg(distinct jsonb_build_object(
'step', instructions.step_number,
'instruction', instructions.description
)) as instructions`)
)
.groupBy('r.id')
.first()
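For reference, a sketch of the plain SQL this roughly corresponds to (the table and column names are taken from the snippet above, so treat them as assumptions). The DISTINCT inside the aggregate deduplicates the jsonb objects produced by the row-multiplying joins, which is why jsonb_build_object (which has a defined equality) is used rather than json_build_object:
SELECT r.id,
       json_agg(DISTINCT jsonb_build_object(
           'ingredient', ingredients.name,
           'quantity', list.quantity
       )) AS ingredients,
       json_agg(DISTINCT jsonb_build_object(
           'step', instructions.step_number,
           'instruction', instructions.description
       )) AS instructions
FROM recipes r
JOIN ingredients_list list ON list.recipe_id = r.id
JOIN ingredients ON list.ingredient_id = ingredients.id
JOIN instructions ON instructions.recipe_id = r.id
WHERE r.id = 1
GROUP BY r.id;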

How to query on multiple attributes in the same json object array?

I have a JSON document similar to this structure in a column of my database:
{
  "id": "123abc",
  "Y/N": "Y",
  "Color": "Purple",
  "arr": [
    {
      "time": 1210.55,
      "person": "Sean",
      "action": "yes"   // The values for this field can only be 'yes', 'no', 'maybe'
    },
    {
      "time": 1230.19,
      "person": "Linda",
      "action": "no"
    }
  ]
}
I need to pull all the corresponding attributes based on two criteria for an object in the "arr" array. I want to get the latest "arr" object based on "time" (highest value), but only when its "action" is 'no' or 'yes', i.e. excluding all objects where "action" = "maybe".
I have tried using a WHERE clause to set a "time" range and ORDER BY ... DESC to pull the latest entry and return the entire "arr" object. This returns the highest value of "time", but it also returns objects where "action" = "maybe", whereas I only want objects with "yes" or "no".
Here is the current query I have:
SELECT jsonb_build_object('ID', t.col -> '_id',
'Yes or No', t.col -> 'Y/N',
'arr', x.elem)
FROM tbl t
CROSS JOIN LATERAL (
SELECT elem
FROM jsonb_array_elements(t.col -> 'arr') a(elem)
WHERE a.elem -> 'time' between '1110.23' and '1514.12'
AND t.col ->> 'Color' = 'Purple'
ORDER BY a.elem -> 'time' DESC NULLS LAST
LIMIT 1
) x;
The query returns the latest object in the array with the highest time, but it also returns objects where "action" = "maybe". I tried adding AND a.elem -> 'action' = 'yes' after the WHERE clause, but I get an error saying the token "yes" is invalid.
Is it possible to return the object with the largest "time" whose "action" attribute is equal to "yes" or "no" only?
Your code alternates between -> and ->>, but it isn't using them correctly.
There are lots of good details here: What is the difference between `->>` and `->` in Postgres SQL?.
Wherever the code compares a JSON value to a text literal, -> should be ->>. If you try that in the WHERE clause, with AND a.elem ->> 'action' = 'yes', it should work.
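Putting that together, a hedged sketch of the adjusted query: the tbl/col names are kept from the question, 'id' matches the sample JSON rather than the '_id' in the query, "time" is cast to numeric so the range check is numeric rather than textual, and both 'yes' and 'no' are allowed as asked:
SELECT jsonb_build_object('ID', t.col -> 'id',
                          'Yes or No', t.col -> 'Y/N',
                          'arr', x.elem)
FROM tbl t
CROSS JOIN LATERAL (
    SELECT elem
    FROM jsonb_array_elements(t.col -> 'arr') a(elem)
    WHERE (a.elem ->> 'time')::numeric BETWEEN 1110.23 AND 1514.12
      AND a.elem ->> 'action' IN ('yes', 'no')            -- exclude "maybe"
    ORDER BY (a.elem ->> 'time')::numeric DESC NULLS LAST
    LIMIT 1
) x
WHERE t.col ->> 'Color' = 'Purple';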

jsonb LIKE query on nested objects in an array

My JSON data looks like this:
[
  {
    "id": 1,
    "payload": {
      "location": "NY",
      "details": [
        {
          "name": "cafe",
          "cuisine": "mexican"
        },
        {
          "name": "foody",
          "cuisine": "italian"
        }
      ]
    }
  },
  {
    "id": 2,
    "payload": {
      "location": "NY",
      "details": [
        {
          "name": "mbar",
          "cuisine": "mexican"
        },
        {
          "name": "fdy",
          "cuisine": "italian"
        }
      ]
    }
  }
]
Given a text "foo", I want to return all the tuples that contain this substring, but I cannot figure out how to write the query.
I followed this related answer but cannot figure out how to do a LIKE.
This is what I have working right now:
SELECT r.res->>'name' AS feature_name, d.details::text
FROM restaurants r
, LATERAL (SELECT ARRAY (
SELECT * FROM json_populate_recordset(null::foo, r.res#>'{payload, details}')
)
) AS d(details)
WHERE d.details #> '{cafe}';
Instead of passing the whole text cafe, I want to pass ca and get the results that match that partial text.
Your solution can be simplified some more:
SELECT r.res->>'name' AS feature_name, d.name AS detail_name
FROM restaurants r
, jsonb_populate_recordset(null::foo, r.res #> '{payload, details}') d
WHERE d.name LIKE '%oh%';
Or simpler yet, with jsonb_array_elements(), since you don't actually need the row type (foo) at all in this example:
SELECT r.res->>'name' AS feature_name, d->>'name' AS detail_name
FROM restaurants r
, jsonb_array_elements(r.res #> '{payload, details}') d
WHERE d->>'name' LIKE '%oh%';
db<>fiddle here
But that's not what you asked exactly:
I want to return all the tuples that have this substring.
You are returning all JSON array elements (0-n per base table row), where one particular key ('{payload,details,*,name}') matches (case-sensitively).
And your original question had a nested JSON array on top of this. You removed the outer array for this solution - I did the same.
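If the outer array were kept, a rough sketch (assuming the res column then holds the full array shown at the top of the question) would just add one more unnesting step:
SELECT e->>'id' AS id, d->>'name' AS detail_name
FROM restaurants r
, jsonb_array_elements(r.res) e                       -- outer array
, jsonb_array_elements(e #> '{payload, details}') d   -- nested details
WHERE d->>'name' LIKE '%oh%';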
Depending on your actual requirements, the new text search capabilities of Postgres 10 might be useful.
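For illustration only (my sketch, not part of the original answer): Postgres 10 added to_tsvector() variants that accept json/jsonb, so a prefix search over all string values in the details could look roughly like this. Note that full-text search matches word prefixes ('ca:*'), not arbitrary '%oh%' substrings:
SELECT r.res->>'name' AS feature_name
FROM restaurants r
WHERE to_tsvector('english', r.res #> '{payload, details}') @@ to_tsquery('english', 'ca:*');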
I ended up doing this (inspired by this answer: jsonb query with nested objects in an array):
SELECT r.res->>'name' AS feature_name, d.details::text
FROM restaurants r
, LATERAL (
SELECT * FROM json_populate_recordset(null::foo, r.res#>'{payload, details}')
) AS d(details)
WHERE d.details LIKE '%oh%';
Fiddle here - http://sqlfiddle.com/#!15/f2027/5