Query for many-to-many record matching - SQL

I have a table called tag_store.
I want to filter the ids that have all of the provided tags, using a query like this:
SELECT st.id from public."tag_store" st
inner join
(SELECT x.tg_type,x.tg_value FROM json_to_recordset
('[{"tg_type":1, "tg_value":"cd"}, {"tg_type":2, "tg_value":"tg"}, {"tg_type":3, "tg_value":"po"}]'::json)
AS x (tg_type int, tg_value TEXT)) ftg
on st.tg_type= ftg.tg_type
and st.tg_value = ftg.tg_value order by st.id;
My desired output is only id 1, as it is the only one where all three tg_value and tg_id pairs match.
Please help: what should I change, or is there a better alternative?
Thanks

I would aggregate the values into a JSON array and use the @> operator to filter those that have all of them:
with tags as (
select id, jsonb_agg(jsonb_build_object('tg_id', tag_id, 'tg_value', tag_value)) all_tags
from tag_store
group by id
)
select *
from tags
where all_tags @> '[{"tg_id":1, "tg_value": "cd"},
{"tg_id":2, "tg_value": "tg"},
{"tg_id":3, "tg_value": "po"}]'
;
You can also do that directly in a HAVING clause if you want
select id
from tag_store
group by id
having jsonb_agg(jsonb_build_object('tg_id', tag_id, 'tg_value', tag_value))
@> '[{"tg_id":1, "tg_value": "cd"},
{"tg_id":2, "tg_value": "tg"},
{"tg_id":3, "tg_value": "po"}]'
;
Note that this will return IDs that have additional tags apart from those in the comparison array.
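If the tags are kept as plain rows, a join plus GROUP BY / HAVING COUNT (often called relational division) is another way to express "has all of these tags". The sketch below runs that idea against an in-memory SQLite database from Python only so that it is self-contained. The sample rows are made up, the column names tg_type and tg_value are taken from the question, and it assumes each (id, tg_type, tg_value) combination appears at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag_store (id INTEGER, tg_type INTEGER, tg_value TEXT);
    -- Hypothetical sample rows: only id 1 carries all three wanted tags.
    INSERT INTO tag_store VALUES
        (1, 1, 'cd'), (1, 2, 'tg'), (1, 3, 'po'), (1, 4, 'xx'),
        (2, 1, 'cd'), (2, 2, 'tg'),
        (3, 3, 'po');
""")

wanted = [(1, "cd"), (2, "tg"), (3, "po")]

# One "(?, ?)" placeholder pair per wanted tag, used as an inline VALUES list.
placeholders = ", ".join(["(?, ?)"] * len(wanted))
query = f"""
    WITH ftg(tg_type, tg_value) AS (VALUES {placeholders})
    SELECT st.id
    FROM tag_store AS st
    JOIN ftg ON st.tg_type = ftg.tg_type AND st.tg_value = ftg.tg_value
    GROUP BY st.id
    HAVING COUNT(*) = ?   -- keep ids that matched every wanted pair
    ORDER BY st.id
"""
params = [value for pair in wanted for value in pair] + [len(wanted)]
matching_ids = [row[0] for row in conn.execute(query, params)]
print(matching_ids)
```

As with the JSON approach, ids that carry extra tags beyond the wanted ones are still returned.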

Related

Array field in SQL BigQuery return error when to filter

I have the following query in BigQuery:
SELECT *
FROM `data`, UNNEST(deliveries.modalities.campaigns) as dmc
where
dmc.id = 4469
The structure of the deliveries field is:
deliveries RECORD REPEATED
-----items RECORD REPEATED
-----modalities RECORD REPEATED
----------campaigns RECORD REPEATED
---------------coparticipations RECORD REPEATED
---------------id
I want to filter on deliveries.modalities.campaigns.id, but my query doesn't work. Can anyone help me?
Another approach you might consider is to use a CTE, as shown below:
with test_1 as (
select *
from `your-project.your-dataset.test_deliveries`, unnest (deliveries) d
JOIN unnest (d.modalities) m
)
select *
from test_1, unnest (campaigns) c
where c.id = 4469
My loaded .jsonl file to create my sample data:
{"deliveries": [{"items": [1,2,3],"modalities": [{"campaigns": [{"coparticipations": [1,2,3],"id": 1234}]}]}]}
{"deliveries": [{"items": [2,3,4],"modalities": [{"campaigns": [{"coparticipations": [4,5,6],"id": 2345}]}]}]}
{"deliveries": [{"items": [3,4,5],"modalities": [{"campaigns": [{"coparticipations": [7,8,9],"id": 4469}]}]}]}
{"deliveries": [{"items": [4,5,6],"modalities": [{"campaigns": [{"coparticipations": [10,11,12],"id": 3456}]}]}]}
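The underlying point is that every REPEATED level has to be flattened on its own: deliveries is an array, so its modalities cannot be reached with a dot, and the same holds for campaigns inside modalities. As a rough illustration of what the chained UNNEST calls do, here is a Python sketch that walks the same four sample rows with one loop per repeated level. The function name campaigns_with_id is hypothetical.

```python
import json

# The four sample rows from the .jsonl file above.
jsonl = """
{"deliveries": [{"items": [1,2,3],"modalities": [{"campaigns": [{"coparticipations": [1,2,3],"id": 1234}]}]}]}
{"deliveries": [{"items": [2,3,4],"modalities": [{"campaigns": [{"coparticipations": [4,5,6],"id": 2345}]}]}]}
{"deliveries": [{"items": [3,4,5],"modalities": [{"campaigns": [{"coparticipations": [7,8,9],"id": 4469}]}]}]}
{"deliveries": [{"items": [4,5,6],"modalities": [{"campaigns": [{"coparticipations": [10,11,12],"id": 3456}]}]}]}
"""

def campaigns_with_id(rows, campaign_id):
    """Yield (row, campaign) pairs, one per matching campaign."""
    for row in rows:
        for delivery in row["deliveries"]:              # unnest(deliveries) d
            for modality in delivery["modalities"]:     # unnest(d.modalities) m
                for campaign in modality["campaigns"]:  # unnest(campaigns) c
                    if campaign["id"] == campaign_id:   # where c.id = ...
                        yield row, campaign

rows = [json.loads(line) for line in jsonl.splitlines() if line.strip()]
matches = list(campaigns_with_id(rows, 4469))
print(len(matches), matches[0][1]["coparticipations"])
```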

Postgres - order records based on a property inside an array of json objects

I'm working with a Postgres database and I have a products view like this:
id | name     | product_groups
1  | product1 | [{...}]
2  | product2 | [{...}]
The product_groups field contains an array of JSON objects describing the product groups that the product belongs to. Each JSON object has the following structure:
{
"productGroupId": 1001,
"productGroupName": "Microphones",
"orderNo": 1
}
I have a query to get all the products that belong to certain group:
SELECT * FROM products p WHERE p.product_groups @> '[{"productGroupId": 1001}]'
However, I want to get all the products ordered by the orderNo property of the group that I'm querying for.
What should I add or modify in my query to achieve this?
I am not really sure I understand your question. My assumptions are:
there will only be one match for the condition on product groups
you want to sort the result rows from the products table, not the elements of the array.
If those two assumptions are correct, you can use a JSON path expression to extract the value of orderNo and then sort by it.
SELECT p.*
FROM products p
WHERE p.product_groups @> '[{"productGroupId": 1001}]'
ORDER BY jsonb_path_query_first(p.product_groups, '$[*] ? (@.productGroupId == 1001).orderNo')::int
You have to unnest the array:
SELECT p.*
FROM products AS p
CROSS JOIN LATERAL jsonb_array_elements(p.product_groups) AS arr(elem)
WHERE arr.elem @> '{"productGroupId": 1001}'
ORDER BY CAST(arr.elem ->> 'orderNo' AS bigint);
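The two answers behave the same when a product has exactly one element for the requested group. To make the "filter, then sort by the matching element's orderNo" logic concrete, here is a small Python sketch that mirrors the first answer by taking the first matching element per product. The sample products, the "Cables" group, and the function name products_in_group are made up for illustration.

```python
# Hypothetical sample data shaped like the products view in the question.
products = [
    {"id": 1, "name": "product1", "product_groups": [
        {"productGroupId": 1001, "productGroupName": "Microphones", "orderNo": 2}]},
    {"id": 2, "name": "product2", "product_groups": [
        {"productGroupId": 1002, "productGroupName": "Cables", "orderNo": 1},
        {"productGroupId": 1001, "productGroupName": "Microphones", "orderNo": 1}]},
    {"id": 3, "name": "product3", "product_groups": [
        {"productGroupId": 1002, "productGroupName": "Cables", "orderNo": 5}]},
]

def products_in_group(products, group_id):
    """Return the products that belong to group_id, sorted by that group's orderNo."""
    keyed = []
    for product in products:
        # First matching array element, similar in spirit to jsonb_path_query_first.
        match = next((group for group in product["product_groups"]
                      if group["productGroupId"] == group_id), None)
        if match is not None:
            keyed.append((match["orderNo"], product))
    keyed.sort(key=lambda pair: pair[0])
    return [product for _, product in keyed]

ordered = products_in_group(products, 1001)
print([product["name"] for product in ordered])
```

Note that the sort key comes from the element that matched the group, not from the first element of the array.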

How to group results by values that are inside json array in postgreSQL

I have a column of type JSONB that has data like this:
column name: used_filters
row number 1 example:
{ "categories" : ["economic", "Social"], "tags": ["world" ,"eco-friendly"] }
row number 2 example:
{ "categories" : ["economic"], "tags": ["eco-friendly"] , "keywords" : ["2050"] }
I want to group the result to get the most frequent value for each one of the keys
something like this:
key      | most_freq
category | economic
tags     | eco-friendly
keyword  | 2050
The keys are not constant and could be something other than those in my example, but I know that they will be frequent.
You can first extract the keys and their array values with jsonb_each, and then unnest those arrays with jsonb_array_elements_text. The rest is a classic aggregation, ranking the counts per key with a window function:
SELECT key, value
FROM ( SELECT j.key, jj.value,
RANK() OVER (PARTITION BY j.key ORDER BY COUNT(*) DESC)
FROM t,
LATERAL jsonb_each(js) AS j,
LATERAL jsonb_array_elements_text(j.value) AS jj
GROUP BY j.key, jj.value ) AS q
WHERE rank = 1
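To see what the query computes, here is a rough Python equivalent run on the two example rows from the question. It is only a sketch of the counting logic, not a replacement for the SQL; the function name most_frequent_per_key is hypothetical. Like RANK(), it keeps every value that ties for first place.

```python
from collections import Counter, defaultdict

# The two example rows from the question.
rows = [
    {"categories": ["economic", "Social"], "tags": ["world", "eco-friendly"]},
    {"categories": ["economic"], "tags": ["eco-friendly"], "keywords": ["2050"]},
]

def most_frequent_per_key(rows):
    """Map each key to the sorted list of its most frequent values (ties kept)."""
    counts = defaultdict(Counter)
    for row in rows:
        for key, values in row.items():   # plays the role of jsonb_each
            counts[key].update(values)    # jsonb_array_elements_text + COUNT(*)
    result = {}
    for key, counter in counts.items():
        top = max(counter.values())
        result[key] = sorted(value for value, n in counter.items() if n == top)
    return result

top_values = most_frequent_per_key(rows)
print(top_values)
```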

How to group by more than one row value?

I am working with PostgreSQL and I can't figure out how to solve a problem. I have a model called Foobar. Some of its attributes are:
FOOBAR
check_in:datetime
qr_code:string
city_id:integer
In this table there is a lot of redundancy (qr_code is not unique) but that is not my problem right now. What I am trying to get are the foobars that have same qr_code and have been in a well known group of cities, that have checked in at different moments.
I got this by querying:
SELECT * FROM foobar AS a
WHERE a.city_id = 1
AND EXISTS (
SELECT * FROM foobar AS b
WHERE a.check_in < b.check_in
AND a.qr_code = b.qr_code
AND b.city_id = 2
AND EXISTS (
SELECT * FROM foobar as c
WHERE b.check_in < c.check_in
AND c.qr_code = b.qr_code
AND c.city_id = 3
AND EXISTS(...)
)
)
where '...' represents more queries to get more persons with the same qr_code, different check_in date and those well known cities.
My problem is that I want to group this by qr_code, and I want to show the check_in fields of each qr_code like this:
2015-11-11 14:14:14 => [2015-11-11 14:14:14, 2015-11-11 16:16:16, 2015-11-11 17:18:20] (this for each different qr_code)
where the data at the left is the 'smaller' date for that qr_code, and the right part are all the other dates for that qr_code, including the first one.
Is this possible to do with a sql query only? I am asking this because I am actually doing this app with rails, and I know that I can make a different approach with array methods of ruby (a solution with this would be well received too)
You could solve that with a recursive CTE - if I interpret your question correctly:
I am assuming you have a given list of cities that must be visited in order by the same qr_code. Your text doesn't say so, but your query indicates as much.
WITH RECURSIVE
c AS (SELECT '{1,2,3}'::int[] AS cities) -- your list of city_id's here
, route AS (
SELECT f.check_in, f.qr_code, 2 AS idx
FROM foobar f
JOIN c ON f.city_id = c.cities[1]
UNION ALL
SELECT f.check_in, f.qr_code, r.idx + 1
FROM route r
JOIN foobar f USING (qr_code)
JOIN c ON f.city_id = c.cities[r.idx]
WHERE r.check_in < f.check_in
)
SELECT qr_code, array_agg(check_in) AS check_in_list
FROM (
SELECT *
FROM route
ORDER BY qr_code, idx -- or check_in
) sub
GROUP BY 1
HAVING count(*) = (SELECT array_length(cities, 1) FROM c);
Provide the list as array in the first (non-recursive) CTE c.
In the recursive part start with any rows in the first city and travel along your array until the last element.
In the final SELECT aggregate your check_in column in order. Only return qr_code that have visited all cities of the array.
Similar:
Recursive query used for transitive closure
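Since the question also welcomes an approach outside SQL, here is a sketch of the same "cities visited in order" check done in application code. It is written in Python rather than Ruby and uses a greedy single pass per qr_code, which is a different technique from the recursive CTE above: it only reports one valid chain of check-ins per qr_code (the earliest one). The function name ordered_visits and the sample check-ins for codes "A", "B" and "C" are made up.

```python
from collections import defaultdict
from datetime import datetime

def ordered_visits(checkins, cities):
    """Return {qr_code: [check_in, ...]} for codes that visited `cities` in order.

    checkins is an iterable of (qr_code, city_id, check_in) tuples.
    """
    by_code = defaultdict(list)
    for qr_code, city_id, check_in in checkins:
        by_code[qr_code].append((check_in, city_id))

    result = {}
    for qr_code, visits in by_code.items():
        visits.sort()  # chronological order
        matched = []
        for check_in, city_id in visits:
            if len(matched) == len(cities):
                break
            # Advance only on the next expected city, at a strictly later time.
            if city_id == cities[len(matched)] and (not matched or check_in > matched[-1]):
                matched.append(check_in)
        if len(matched) == len(cities):
            result[qr_code] = matched
    return result

t = datetime.fromisoformat
checkins = [
    ("A", 1, t("2015-11-11 14:14:14")),
    ("A", 2, t("2015-11-11 16:16:16")),
    ("A", 3, t("2015-11-11 17:18:20")),
    ("B", 1, t("2015-11-11 10:00:00")),
    ("B", 3, t("2015-11-11 11:00:00")),  # never checks in at city 2
    ("C", 2, t("2015-11-11 09:00:00")),  # city 2 comes before city 1
    ("C", 1, t("2015-11-11 10:00:00")),
    ("C", 3, t("2015-11-11 11:00:00")),
]
routes = ordered_visits(checkins, [1, 2, 3])
print({code: [str(ts) for ts in stamps] for code, stamps in routes.items()})
```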

Set limit to array_agg()

I have the following Postgres query:
SELECT array_agg("Esns".id )
FROM public."Esns",
public."PurchaseOrderItems"
WHERE
"Esns"."PurchaseOrderItemId" = "PurchaseOrderItems".id
AND "PurchaseOrderItems"."GradeId"=2
LIMIT 2;
The limit will affect the rows. I want it to limit the array_agg() to 2 items. The following query works but I get my output with each entry in quotes:
SELECT array_agg ("temp")
FROM (
SELECT "Esns".id
FROM public."Esns",
public."PurchaseOrderItems"
WHERE
"Esns"."PurchaseOrderItemId" = "PurchaseOrderItems".id
AND "PurchaseOrderItems"."GradeId"=2
LIMIT 4
) as "temp" ;
This gives me the following output:
{(13),(14),(15),(12)}
Any ideas?
select id[1], id[2]
from (
SELECT array_agg("Esns".id ) as id
FROM public."Esns",
public."PurchaseOrderItems"
WHERE
"Esns"."PurchaseOrderItemId" = "PurchaseOrderItems".id
AND "PurchaseOrderItems"."GradeId"=2
) s
Or, if you want the output as an array, you can slice it:
SELECT (array_agg("Esns".id ))[1:2] as id_array
FROM public."Esns",
public."PurchaseOrderItems"
WHERE
"Esns"."PurchaseOrderItemId" = "PurchaseOrderItems".id
AND "PurchaseOrderItems"."GradeId"=2
The parentheses (not "quotes") in the result are decorators for the row literals. You are building an array of whole rows (which happen to contain only a single column). Instead, aggregate only the column.
Also, direct array construction from a query result is typically simpler and faster:
SELECT ARRAY (
SELECT e.id
FROM public."Esns" e
JOIN public."PurchaseOrderItems" p ON p.id = e."PurchaseOrderItemId"
WHERE p."GradeId" = 2
-- ORDER BY ???
LIMIT 4 -- or 2?
)
You need to ORDER BY something if you want a stable result and / or pick certain rows. Otherwise the result is arbitrary and can change with every next call.
While at it, I rewrote the query with explicit JOIN syntax, which is generally preferable, and used table aliases to simplify.
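The "order, limit, then collect a single column" pattern can be tried out without a Postgres server. The sketch below uses an in-memory SQLite database from Python with made-up rows; the table and column names come from the question, and ordering by e.id is an assumption about which two rows are wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "PurchaseOrderItems" (id INTEGER PRIMARY KEY, "GradeId" INTEGER);
    CREATE TABLE "Esns" (id INTEGER PRIMARY KEY, "PurchaseOrderItemId" INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO "PurchaseOrderItems" VALUES (1, 2), (2, 3);
    INSERT INTO "Esns" VALUES (12, 1), (13, 1), (14, 1), (15, 2);
""")

query = """
    SELECT e.id
    FROM "Esns" AS e
    JOIN "PurchaseOrderItems" AS p ON p.id = e."PurchaseOrderItemId"
    WHERE p."GradeId" = 2
    ORDER BY e.id   -- assumption: the two lowest ids are the ones wanted
    LIMIT 2
"""
# Collect the single column into a list: plain values, not one-column rows.
first_two = [row[0] for row in conn.execute(query)]
print(first_two)
```

Without the ORDER BY, any two of the matching ids could come back.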