Postgresql: select rows by OR condition - including going through json array

I have a table created by the following query:
create table data
(
id integer not null unique,
owner text,
users jsonb not null
);
The table looks like this:
+----+-------+---------------------------------------------+
| id | owner | users |
+----+-------+---------------------------------------------+
| 1 | alice | [] |
| 2 | bob | [{"accountId": "alice", "role": "manager"}] |
| 3 | john | [{"accountId": "bob", "role": "guest"}] |
+----+-------+---------------------------------------------+
I need to get rows 1 and 2 on behalf of Alice.
Getting owner-based rows works perfectly:
SELECT *
FROM data
WHERE owner = 'alice'
Getting jsonb-based rows is a little trickier, though manageable:
SELECT *
FROM data, jsonb_array_elements(users) x
WHERE (x ->> 'accountId') = 'alice'
But getting them together gets me just the jsonb-based ones:
SELECT *
FROM data, jsonb_array_elements(users) x
WHERE owner = 'alice' OR (x ->> 'accountId') = 'alice'
How do I get a selection that looks like the following?
+----+-------+---------------------------------------------+
| id | owner | users |
+----+-------+---------------------------------------------+
| 1 | alice | [] |
| 2 | bob | [{"accountId": "alice", "role": "manager"}] |
+----+-------+---------------------------------------------+
Even better if I can get a selection that looks like this
+----+----------+
| id | role |
+----+----------+
| 1 | owner |
| 2 | manager |
+----+----------+

The problem is with the empty json array, which evicts the corresponding row from the result set when cross joined with jsonb_array_elements(). Instead, you can make a left join lateral:
select d.*
from data d
left join lateral jsonb_array_elements(d.users) as x(js) on 1 = 1
where 'alice' in (d.owner, x.js ->> 'accountId')
Note that if your array always contains 0 or 1 element, you don't need the lateral join; your query can be phrased more simply as:
select d.*
from data d
where 'alice' in (d.owner, d.users -> 0 ->> 'accountId')
Demo on DB Fiddle - both queries return:
id | owner | users
-: | :---- | :------------------------------------------
1 | alice | []
2 | bob | [{"role": "manager", "accountId": "alice"}]
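To see why the cross join loses row 1 while the left join keeps it, here is a minimal Python sketch that models the two join shapes on the sample rows. It only illustrates the semantics and is not PostgreSQL code; the function names are hypothetical, and roles_for is an assumed way to build the id/role output the question asks for.

```python
# Rows of the example table; `users` is shown already parsed from JSON.
data = [
    {"id": 1, "owner": "alice", "users": []},
    {"id": 2, "owner": "bob", "users": [{"accountId": "alice", "role": "manager"}]},
    {"id": 3, "owner": "john", "users": [{"accountId": "bob", "role": "guest"}]},
]


def cross_join_ids(rows, account):
    # Cross join: a row with an empty array produces no (row, element) pairs,
    # so it can never reach the WHERE clause.
    return [
        row["id"]
        for row in rows
        for elem in row["users"]
        if row["owner"] == account or elem.get("accountId") == account
    ]


def left_join_ids(rows, account):
    # Left join: a row with an empty array is kept once, paired with None
    # (the stand-in for SQL NULL here).
    return [
        row["id"]
        for row in rows
        for elem in (row["users"] or [None])
        if row["owner"] == account or (elem or {}).get("accountId") == account
    ]


def roles_for(rows, account):
    # Hypothetical extension: report "owner" for owned rows, otherwise the
    # role stored in the matching array element.
    result = []
    for row in rows:
        if row["owner"] == account:
            result.append((row["id"], "owner"))
            continue
        for elem in row["users"]:
            if elem.get("accountId") == account:
                result.append((row["id"], elem.get("role")))
    return result


print(cross_join_ids(data, "alice"))  # [2]
print(left_join_ids(data, "alice"))   # [1, 2]
print(roles_for(data, "alice"))       # [(1, 'owner'), (2, 'manager')]
```

As in the SQL version, a row owned by alice whose array holds several elements would appear once per element in the left join; a DISTINCT would collapse those duplicates.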

Related

Postgres: How to join table with values from jsonb[] column

I have two tables as follows
accounts
------------------------------------------
| ID | LOCATIONS |
|------------------------------------------|
| 1 | [{ "id" : 1}, { "id" : 3 }] |
|------------------------------------------|
| 2 | [] |
------------------------------------------
regions
----------------------------
| ID | DATA |
|---------------------------|
| 1 | {"name": "South"} |
|---------------------------|
| 2 | {"name": "West"} |
|---------------------------|
| 3 | {"name": "North"} |
|---------------------------|
| 4 | {"name": "East"} |
---------------------------
locations is of type jsonb[]
Now I wanted to get result as follows
------
| NAME |
|------|
| South|
|------|
| North|
------
Please help with the postgresql query to get this.
Edited for jsonb[] type:
Demo
select
r.data ->> 'name' as name
from
accounts a
cross join unnest(a.locations) al
inner join regions r on r.id = (al ->> 'id')::int
P.S: for jsonb type:
You can use jsonb_to_recordset function and CROSS JOIN to join JSON array record with table.
Demo
select
r.data ->> 'name' as name
from
accounts a
cross join jsonb_to_recordset(a.locations) as al(id int)
inner join regions r on r.id = al.id
One option would be using JSONB_ARRAY_ELEMENTS() along with cross joins such as
SELECT r.data->>'name' AS "Name"
FROM accounts AS a,
regions AS r,
JSONB_ARRAY_ELEMENTS(a.locations) AS l
WHERE (value->>'id')::INT = r.id
Demo
PS. if the data type of locations is JSON rather than JSONB, then just replace the current function with JSON_ARRAY_ELEMENTS()
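All three queries follow the same pattern: expand each account's array into one row per element, then join each element's id to regions. A rough Python model of that pattern on the sample data (the names below are illustrative, not part of any answer):

```python
# Sample data from the question, with JSON values already parsed.
accounts = [
    {"id": 1, "locations": [{"id": 1}, {"id": 3}]},
    {"id": 2, "locations": []},
]
regions = {
    1: {"name": "South"},
    2: {"name": "West"},
    3: {"name": "North"},
    4: {"name": "East"},
}


def region_names(accounts, regions):
    names = []
    for account in accounts:
        # Expanding the array: one iteration per element, none for [].
        for location in account["locations"]:
            # Inner join on regions.id = (element ->> 'id')::int
            region = regions.get(int(location["id"]))
            if region is not None:
                names.append(region["name"])
    return names


print(region_names(accounts, regions))  # ['South', 'North']
```

Account 2 contributes nothing because its array is empty, which is the intended result for an inner join.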

How to join with aggregate arrays on value in jsonb array in postgres?

Games
+----------------+--------+-------+------+
| Title | GameId | Genre | Tag |
+----------------+--------+-------+------+
| Aeon | A1 | RPG | B1 |
| Questerra | A2 | RPG | B2 |
| Age of Thunder | A3 | RPG | B3 |
+----------------+--------+-------+------+
Items
+-----------+----------------------------------------------------------------+
| Type | Objects |
+-----------+----------------------------------------------------------------+
| Longsword | {type: 'weapon', game_ids: ['A1', 'A3'], tag_ids: ['B1']} |
| Scimitar | {type: 'weapon', game_ids: ['A2'], tag_ids: ['B2', 'B3']} |
| Longbow | {type: 'weapon', game_ids: ['A1', 'A2'], tag_ids: ['B2', 'B3']} |
+-----------+----------------------------------------------------------------+
I have tables similar to the above. The game_ids and tag_ids keys in Objects are both jsonb values that contain arrays of ids. What I would like to do is return an array of Type along with table Games where either a GameId or Tag is in Objects.game_ids or Objects.tag_ids respectively. I sort of have an idea of how this is supposed to work. I think it's something like
SELECT ARRAY_AGG(it.Type) Types, g.*
FROM Games as g
LEFT JOIN Items as it
ON TRUE
WHERE (
it.Objects::jsonb #> '{game_ids}' ? g.GameId::text
OR
it.Objects::jsonb #> '{tag_ids}' ? g.Tag::text
);
but this query executes and never resolves. There's no indication of an error, and I suspect that it's either not doing what I expect it to or is just insanely inefficient. What should this query look like?
You can use a subquery with array_to_json:
select g.*, (select array_to_json(array_agg(i.type))
from items i
where exists (select 1 from jsonb_array_elements(i.objects -> 'game_ids') v where v.value::text = concat('"', g.gameid, '"'))
or exists (select 1 from jsonb_array_elements(i.objects -> 'tag_ids') v where v.value::text = concat('"', g.tag, '"')))
from games g;
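The correlated subquery collects, for each game, every item type whose game_ids contains the game's id or whose tag_ids contains its tag. A small Python model of that logic on the sample data, with illustrative names, may make the expected result easier to check:

```python
games = [
    {"title": "Aeon", "gameid": "A1", "tag": "B1"},
    {"title": "Questerra", "gameid": "A2", "tag": "B2"},
    {"title": "Age of Thunder", "gameid": "A3", "tag": "B3"},
]
items = [
    {"type": "Longsword", "objects": {"game_ids": ["A1", "A3"], "tag_ids": ["B1"]}},
    {"type": "Scimitar", "objects": {"game_ids": ["A2"], "tag_ids": ["B2", "B3"]}},
    {"type": "Longbow", "objects": {"game_ids": ["A1", "A2"], "tag_ids": ["B2", "B3"]}},
]


def types_per_game(games, items):
    # One result entry per game, like the scalar subquery in the SELECT list.
    result = {}
    for game in games:
        result[game["title"]] = [
            item["type"]
            for item in items
            if game["gameid"] in item["objects"]["game_ids"]
            or game["tag"] in item["objects"]["tag_ids"]
        ]
    return result


for title, types in types_per_game(games, items).items():
    print(title, types)
# Aeon ['Longsword', 'Longbow']
# Questerra ['Scimitar', 'Longbow']
# Age of Thunder ['Longsword', 'Scimitar', 'Longbow']
```

One difference from the SQL: a game with no matching item gets an empty list here, whereas array_agg over zero rows yields NULL.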

How to duplicate and merge rows through select query

I've an existing Postgresql select query output that gives me,
+------+------------+------+------+
| Type | ID | Pass | Fail |
+------+------------+------+------+
| IQC | ABC_IQC_R2 | 0 | 6 |
+------+------------+------+------+
| IQC | ABC_IQC_R1 | 2 | 6 |
+------+------------+------+------+
| IQC | ABC_IQC | 498 | 8 |
+------+------------+------+------+
How do I duplicate the row of ID-> ABC_IQC into two while merging both R1 & R2 values into that row? (As shown below)
+------+---------+------------+------+------+--------+--------+
| Type | ID | R_ID | Pass | Fail | R_Pass | R_Fail |
+------+---------+------------+------+------+--------+--------+
| IQC | ABC_IQC | ABC_IQC_R2 | 498 | 8 | 0 | 6 |
+------+---------+------------+------+------+--------+--------+
| IQC | ABC_IQC | ABC_IQC_R1 | 498 | 8 | 2 | 6 |
+------+---------+------------+------+------+--------+--------+
The two approaches I can think of are:
Search the IDs for ABC (but I'm unsure how to match them), duplicate the ABC_IQC row, and then merge the rows using a lateral join (still unsure how).
Add an R_ID column holding the IDs of the R2 and R1 rows, look up the original ABC_IQC row by ID, and copy its pass and fail values into both the R2 and R1 rows.
Here is my current query to get the initial query output,
SELECT
split_part(NewLotID, '_', 2) AS "Type",
LotSummary ->> 'ID' AS "ID",
LotSummary ->> 'Pass' AS "Pass",
LotSummary ->> 'Fail' AS "Fail"
FROM
(
SELECT
LotSummary,
regexp_replace(LotSummary ->> 'ID','[- ]','_','g') AS NewLotID
.
.
.
I'm not expecting a full answer because I've hardly provided any code, just any ideas or insights that might be helpful! Thank you in advance.
I think you want a self-join:
with q as (
<your query here>
)
select q.type, q.id, qr.id as r_id, q.pass, q.fail,
qr.pass as r_pass, qr.fail as r_fail
from q join
q qr
on q.id = 'ABC_IQC' and qr.id like 'ABC_IQC_%';
You can actually generalize this:
with q as (
<your query here>
)
select q.type, q.id, qr.id as r_id, q.pass, q.fail,
qr.pass as r_pass, qr.fail as r_fail
from q join
q qr
on q.id ~ '^[^_]+_[^_]+$' and
qr.id like q.id || '_%';
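The generalized self-join pairs every "base" row (an id with exactly one underscore) with each row whose id extends it. Below is a minimal Python sketch of that pairing on the sample output; the names are made up for illustration, and it checks for a literal underscore, whereas in SQL LIKE the `_` matches any single character, so the SQL pattern is slightly looser.

```python
import re

rows = [
    {"type": "IQC", "id": "ABC_IQC_R2", "pass": 0, "fail": 6},
    {"type": "IQC", "id": "ABC_IQC_R1", "pass": 2, "fail": 6},
    {"type": "IQC", "id": "ABC_IQC", "pass": 498, "fail": 8},
]

# Same shape as the SQL regex '^[^_]+_[^_]+$': exactly one underscore.
BASE_ID = re.compile(r"[^_]+_[^_]+")


def merge_reruns(rows):
    merged = []
    for base in rows:
        if not BASE_ID.fullmatch(base["id"]):
            continue
        prefix = base["id"] + "_"
        for rerun in rows:
            # The rerun id must be the base id, an underscore, and at least
            # one more character.
            if rerun["id"].startswith(prefix) and len(rerun["id"]) > len(prefix):
                merged.append(
                    (base["type"], base["id"], rerun["id"],
                     base["pass"], base["fail"], rerun["pass"], rerun["fail"])
                )
    return merged


for row in merge_reruns(rows):
    print(row)
# ('IQC', 'ABC_IQC', 'ABC_IQC_R2', 498, 8, 0, 6)
# ('IQC', 'ABC_IQC', 'ABC_IQC_R1', 498, 8, 2, 6)
```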

How to select table with a concatenated column?

I have the following data:
select * from art_skills_table;
+----+------+---------------------------+
| ID | Name | skills |
+----+------+---------------------------|
| 1 | Anna | ["painting","photography"]|
| 2 | Bob | ["drawing","sculpting"] |
| 3 | Cat | ["pastel"] |
+----+------+---------------------------+
select * from computer_table;
+------+------+-------------------------+
| ID | Name | skills |
+------+------+-------------------------+
| 1 | Anna | ["word","typing"] |
| 2 | Cat | ["code","editing"] |
| 3 | Bob | ["excel","code"] |
+------+------+-------------------------+
I would like to write an SQL statement which results in the following table.
+------+------+-----------------------------------------------+
| ID | Name | skills |
+------+------+-----------------------------------------------+
| 1 | Anna | ["painting","photography","word","typing"] |
| 2 | Bob | ["drawing","sculpting","excel","code"] |
| 3 | Cat | ["pastel","code","editing"] |
+------+------+-----------------------------------------------+
I've tried something like SELECT * from art_skills_table LEFT JOIN computer_table ON name. However it doesn't give what I need. I've read about array_cat but I'm having a bit of trouble implementing it.
If the skills columns in both tables are arrays, then you should be able to get away with this:
SELECT a.ID, a.name, array_cat(a.skills, c.skills)
FROM art_skills_table a LEFT JOIN computer_table c
ON c.id = a.id
That said, while you used a LEFT join in your sample, I think either an INNER or FULL (OUTER) join might serve you better.
First, I wondered why the data is stored in such a model.
I was of the opinion that NoSQL databases lack the ability for joins, and that:
... a semantic triple would be in the form of subject–predicate–object.
... a key-value (KV) store uses associative arrays.
... a relational database would be normalized.
Some information about the use case would have helped.
Nevertheless, you can select the data with CONCAT and REPLACE to get the desired form (this assumes skills is stored as text delimited by { and }):
SELECT art_skills_table.ID, computer_table.name,
CONCAT(
REPLACE(art_skills_table.skills, '}',','),
REPLACE(computer_table.skills, '{','')
)
FROM art_skills_table JOIN computer_table ON art_skills_table.ID = computer_table.ID
The query returns the following result:
+----+------+--------------------------------------------+
| ID | Name | Skills |
+----+------+--------------------------------------------+
| 1 | Anna | {"painting","photography","word","typing"} |
| 2 | Cat | {"drawing","sculpting","code","editing"} |
| 3 | Bob | {"pastel","excel","code"} |
+----+------+--------------------------------------------+
I've used the ID for the JOIN, even though Bob and Cat have different IDs in the two tables.
The JOIN should probably be done on the name instead:
JOIN computer_table ON art_skills_table.Name = computer_table.Name
BTW, you need to tell us what SQL engine you're running on.
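Whichever engine is used, the intended result is a left join on Name followed by array concatenation. A minimal Python sketch of that logic, using the sample rows and hypothetical names:

```python
# (ID, Name, skills) rows from the two sample tables.
art_skills = [
    (1, "Anna", ["painting", "photography"]),
    (2, "Bob", ["drawing", "sculpting"]),
    (3, "Cat", ["pastel"]),
]
computer_skills = [
    (1, "Anna", ["word", "typing"]),
    (2, "Cat", ["code", "editing"]),
    (3, "Bob", ["excel", "code"]),
]


def merge_skills(art, computer):
    # Index the second table by name, because the IDs do not line up.
    by_name = {name: skills for _, name, skills in computer}
    # Left join: keep every art row, appending computer skills when present.
    return [
        (id_, name, skills + by_name.get(name, []))
        for id_, name, skills in art
    ]


for row in merge_skills(art_skills, computer_skills):
    print(row)
# (1, 'Anna', ['painting', 'photography', 'word', 'typing'])
# (2, 'Bob', ['drawing', 'sculpting', 'excel', 'code'])
# (3, 'Cat', ['pastel', 'code', 'editing'])
```

Joining on the name reproduces the table the question asks for, while joining on the ID mixes Bob's and Cat's skills as shown in the result above.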

Return list of tables and count in single query

I know about the describe command \d and select count(*) from my_schema_1.my_table_1;. However, I have quite a few tables and I'd like to get a neat list for the entire database. Something like below would be nice.
my_schema_1 | mytable_1 | 12323
my_schema_2 | mytable_2 | 0
I'd basically like to loop over all the tables.
Maybe something like this (no need to execute a COUNT(*) for each table).
EDIT: new version that also handles tables without projections:
SELECT
t.table_schema AS schema,
t.table_name AS table,
ZEROIFNULL(
CASE WHEN p.is_segmented IS TRUE
THEN SUM(ps.row_count) * COUNT(DISTINCT ps.node_name) // COUNT(ps.node_name)
ELSE MAX(ps.row_count)
END
) AS row_count,
CASE WHEN p.is_segmented THEN 'Yes' ELSE 'No' END AS segmented,
COUNT(DISTINCT p.projection_id) AS num_proj
FROM
v_catalog.tables t
LEFT OUTER JOIN v_monitor.projection_storage ps
ON t.table_id = ps.anchor_table_id
LEFT OUTER JOIN v_catalog.projections p
ON t.table_id = p.anchor_table_id
AND p.is_super_projection IS TRUE
GROUP BY
t.table_schema, t.table_name, p.is_segmented
ORDER BY
t.table_schema, t.table_name
;
Sample output:
schema | table | row_count | segmented | num_proj
--------+------------------------+-----------+-----------+----------
mauro | city | 5 | Yes | 2
mauro | employees | 1000000 | Yes | 2
mauro | empty | 0 | No | 0
mauro | fnames | 20 | Yes | 2
...
tpch | customer | 0 | Yes | 2
tpch | lineitem | 54010935 | Yes | 2
tpch | nation | 25 | No | 1
tpch | orders | 718277000 | Yes | 2
I did add a couple of columns: segmented (Yes/No) and num_proj. You can remove them if you want.
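The arithmetic in the row_count CASE expression can be checked in isolation. The Python sketch below mirrors only that expression, assuming the storage rows for a table arrive as (node_name, row_count) pairs, one per projection per node; the figures in the examples are invented for illustration and do not come from a real cluster.

```python
def estimated_rows(storage_rows, segmented):
    """Mirror of the CASE expression: storage_rows is a list of
    (node_name, row_count) pairs for one table."""
    if not storage_rows:
        return 0  # ZEROIFNULL: a table without projections has no storage rows
    counts = [count for _, count in storage_rows]
    if segmented:
        nodes = {node for node, _ in storage_rows}
        # SUM(row_count) * COUNT(DISTINCT node_name) // COUNT(node_name)
        return sum(counts) * len(nodes) // len(storage_rows)
    # Unsegmented: every copy holds all rows, so MAX is enough.
    return max(counts)


# Invented example: 300 rows split over 3 nodes, stored in 2 projections.
segmented_rows = [("n1", 100), ("n2", 100), ("n3", 100)] * 2
print(estimated_rows(segmented_rows, segmented=True))  # 300

# Invented example: 25 rows replicated in full on each of 3 nodes.
replicated_rows = [("n1", 25), ("n2", 25), ("n3", 25)]
print(estimated_rows(replicated_rows, segmented=False))  # 25

print(estimated_rows([], segmented=False))  # 0
```

Dividing by the ratio of storage rows to distinct nodes removes the extra copies that each additional projection contributes to the sum.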