Join PostgreSQL tables on JSON array columns

I have two tables in PostgreSQL with JSON array columns, tableA.B and tableB.B. How can I join these tables on these JSON columns?
i.e.
select tableA.id, tableA.A, tableB.id, tableA.B, tableB.name
from tableA, tableB
where tableA.B = tableB.B
--tableA--
id | A | B
1 | 36464 | ["874746", "474657"]
2 | 36465 | ["874748"]
3 | 36466 | ["874736", "474654"]
--tableB--
id | name | B
1 | john | ["8740246", "2474657"]
2 | mary | ["874748","874736"]
3 | clara | ["874736", "474654"]

Actually, with the data type jsonb in Postgres 9.4 or later, this becomes dead simple. Your query would just work (ugly naming convention, code and duplicate names in the output aside).
CREATE TEMP TABLE table_a(a_id int, a int, b jsonb);
INSERT INTO table_a VALUES
(1, 36464, '["874746", "474657"]')
, (2, 36465, '["874748"]')
, (3, 36466, '["874736", "474654"]');
CREATE TEMP TABLE table_b(b_id int, name text, b jsonb);
INSERT INTO table_b VALUES
(1, 'john' , '["8740246", "2474657"]')
, (2, 'mary' , '["874748","874736"]')
, (3, 'clara', '["874736", "474654"]');
Query:
SELECT a_id, a, b.*
FROM table_a a
JOIN table_b b USING (b); -- match on the whole jsonb column
That you even ask indicates you are using the data type json, for which no equality operator exists:
How to query a json column for empty objects?
You just didn't mention the most important details.
The obvious solution is to switch to jsonb.
Answer to your comment
is it possible to flatten out b into new rows rather than an array?
Use jsonb_array_elements(jsonb) or jsonb_array_elements_text(jsonb) in a LATERAL join:
SELECT a_id, a, b.b_id, b.name, b_array_element
FROM table_a a
JOIN table_b b USING (b)
, jsonb_array_elements_text(b) b_array_element
This returns only rows matching on the whole array. About LATERAL:
What is the difference between LATERAL and a subquery in PostgreSQL?
If you want to match on array elements instead, unnest your arrays before you join.
The whole setup seems to be in dire need of normalization.
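As a rough illustration of what matching on array elements in a normalized layout could look like, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the child tables a_elem and b_elem (one row per array element) are hypothetical names that do not appear in the question. In PostgreSQL the unnesting step would more likely be done with jsonb_array_elements_text, as mentioned above.

```python
import json
import sqlite3

# Sample rows copied from the question. The table and column names used
# below (a_elem, b_elem, elem) are hypothetical, chosen for this sketch.
TABLE_A = [(1, 36464, '["874746", "474657"]'),
           (2, 36465, '["874748"]'),
           (3, 36466, '["874736", "474654"]')]
TABLE_B = [(1, "john", '["8740246", "2474657"]'),
           (2, "mary", '["874748","874736"]'),
           (3, "clara", '["874736", "474654"]')]


def element_matches():
    """Return (a_id, b_id, elem) for every array element both tables share."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE a_elem (a_id INTEGER, elem TEXT);
        CREATE TABLE b_elem (b_id INTEGER, elem TEXT);
    """)
    # Unnest each JSON array into one row per element.
    for a_id, _a, arr in TABLE_A:
        con.executemany("INSERT INTO a_elem VALUES (?, ?)",
                        [(a_id, e) for e in json.loads(arr)])
    for b_id, _name, arr in TABLE_B:
        con.executemany("INSERT INTO b_elem VALUES (?, ?)",
                        [(b_id, e) for e in json.loads(arr)])
    # With one element per row, the match is an ordinary equality join.
    rows = con.execute("""
        SELECT a_id, b_id, elem
        FROM a_elem
        JOIN b_elem USING (elem)
        ORDER BY a_id, b_id, elem
    """).fetchall()
    con.close()
    return rows


matches = element_matches()
for row in matches:
    print(row)
```

With the sample data, row 2 of table A matches mary on one element, and row 3 matches mary on one element and clara on two.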

WITH tableA AS
(SELECT 1 AS id,
36464 AS A,
'["874746", "474657"]'::jsonb AS B
UNION SELECT 2 AS id,
36465 AS A,
'["874748"]'::jsonb AS B
UNION SELECT 3 AS id,
36466 AS A,
'["874736", "474654"]'::jsonb AS B),
tableB AS
( SELECT 1 AS id,
'john' AS name,
'["8740246", "2474657"]'::jsonb AS B
UNION SELECT 2 AS id,
'mary' AS name,
'["874748", "874736"]'::jsonb AS B
UNION SELECT 3 AS id,
'clara' AS name,
'["874736", "474654"]'::jsonb AS B)
SELECT *
FROM tableA
inner join tableB using(B);
Gives you
b | id | a | id | name
----------------------+----+-------+----+-------
["874736", "474654"] | 3 | 36466 | 3 | clara
Isn't it what you expect?

Related

How can I get full results in SQL query using 3 tables, where 1 of them keeps relation of 2 another?

I need help writing a query to display results I want.
"Table 3 - relations" keeps all relations between table 1 and 2. Often, a relation between table 1 and 2 will not exist in table 3, so I want to see the missing relation in the results for all Table 1 rows - see expected results below.
I can't modify these tables - I have only SELECT privilege.
Data and expected result below:
Table 1 - a:
a_id, a_name
e.g.:
1 A
2 B
Table 2 - b:
b_id, b_name
e.g.:
1 X
2 Y
Table 3 - relation:
asset1_id (it's always id from Table 1), asset2_id (it's always id from Table 2), relation_type
e.g.:
1 1 covers
1 2 covers
Expected result:
Table1_name, Table2_name, Table3_relation_type (including NULL for b_name and relation_type when such relation does not exist in Table 3 - relation)
e.g.
A X covers
A Y covers
B NULL NULL
I can't get the 3rd expected line with NULLs.
I think that this query will produce those results.
select a.a_name, b.b_name, r.relation_type from relation r
join a on a.a_id=r.asset1_id
join b on b.b_id=r.asset2_id
union
select a.a_name, b.b_name, r.relation_type from relation r
full outer join a on a.a_id=r.asset1_id
full outer join b on b.b_id=r.asset2_id
where a.a_id is null or b.b_id is null
With your data sample you could try this one.
It should work in both Hive and Impala.
SELECT t1.name ,t2.name ,r.relation_type
FROM relation r
FULL OUTER JOIN table1 t1 ON(t1.id = r.id1)
FULL OUTER JOIN table2 t2 ON(t2.id = r.id2);
+------+------+---------------+
| name | name | relation_type |
+------+------+---------------+
| A | X | covers |
| A | Y | covers |
| B | NULL | NULL |
+------+------+---------------+
WITH
cte_A AS (
SELECT a_id, a_name
FROM a
),
cte_C AS (
-- resolve each relation row to the name of the table 2 asset it points to
SELECT c.asset1_id AS a_id, b.b_name, c.relation_type
FROM relation c
LEFT JOIN b ON c.asset2_id = b.b_id
)
SELECT cte_A.a_name, cte_C.b_name, cte_C.relation_type
FROM cte_A
LEFT JOIN cte_C ON cte_A.a_id = cte_C.a_id
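For comparison, the expected result can also be reached by starting from table 1 and chaining two LEFT JOINs. The following is only a sketch, run here against an in-memory SQLite database through Python's sqlite3 module rather than the asker's actual database. It assumes the tables are named a, b and relation with the columns listed in the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a (a_id INTEGER, a_name TEXT);
    CREATE TABLE b (b_id INTEGER, b_name TEXT);
    CREATE TABLE relation (asset1_id INTEGER, asset2_id INTEGER,
                           relation_type TEXT);
    INSERT INTO a VALUES (1, 'A'), (2, 'B');
    INSERT INTO b VALUES (1, 'X'), (2, 'Y');
    INSERT INTO relation VALUES (1, 1, 'covers'), (1, 2, 'covers');
""")

# Start from table 1 so every a row survives, then attach relations and
# table 2 names where they exist; missing ones come back as NULL (None).
relation_rows = con.execute("""
    SELECT a.a_name, b.b_name, r.relation_type
    FROM a
    LEFT JOIN relation r ON r.asset1_id = a.a_id
    LEFT JOIN b ON b.b_id = r.asset2_id
    ORDER BY a.a_name, b.b_name
""").fetchall()

for row in relation_rows:
    print(row)
```

Because the query only reads from the three tables, it needs nothing beyond SELECT privilege.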

Select data in tables by conditions between 2 tables which is not linked

I have a large select with a lot of inner joins.
In the select I have an array_agg function for one set of data.
This array contains only a column of a table, but now I want to append at the end of the array data from another table. The data I need to add is not directly linked with the previous table where I need the column.
Query example:
select
origin_table.x,
origin_table.y,
array_agg(table1.data) ...
from
origin_table
inner join ... inner join ... full join table1 on
table1.origin_table_id = origin_table.id ...
group by
...
Result array:
ID 1: example_data, {baba, bobo}
ID 2: example_data, {bibi, bubu}
Example of my tables:
table 1:
id | data | origin_table_id
----+---------+----------
1 | baba | 1
2 | bobo | 1
3 | bibi | 2
4 | bubu | 2
table 2:
id | data_bis
---+---------
1 | byby
2 | bebe
origin table:
id | table2_id
---+----------
1 | 2
2 | 1
Expected result with the 3 tables:
ID 1: example_data, {baba, bobo, bebe}
ID 2: example_data, {bibi, bubu, byby}
But I got:
ID 1: example_data, {baba, bobo, bebe, byby}
ID 2: example_data, {bibi, bubu, bebe, byby}
What I need is:
How can I get all the table 1 data that meets the condition and append to it only the single matching table 2 value, rather than every row of table 2?
Try the query below.
create table tab1 (id integer, data character varying, origin_table_id integer);
insert into tab1
select 1,'baba',1
union all
select 2,'bobo',1
union all
select 3,'bibi',2
union all
select 4,'bubu',2;
create table tab2 (id integer, data_bis character varying);
insert into tab2
select 1,'byby'
union all
select 2,'bebe';
create table OriginalTable (id integer, table2_id integer);
insert into OriginalTable
select 1,2
union all
select 2,1;
select * from OriginalTable;
-- table 1 rows plus, for each origin row, the single table 2 value it points to
select origin_table_id, data
from tab1
union all
select OriginalTable.id, data_bis
from OriginalTable
join tab2 on tab2.id = OriginalTable.table2_id
order by origin_table_id;
Result:
1;"baba"
1;"bobo"
1;"bebe"
2;"bibi"
2;"bubu"
2;"byby"
I can give you some idea and sample code to achieve your requirement.
First you can UNION ALL table 'table 1' and 'table 2' using the relation table 'origin table'.
WITH CTE
AS
(
SELECT AA.id, AA.data AS data_bis, AA.origin_table_id
FROM table_1 AA
UNION ALL
SELECT NULL id, A.data_bis,B.id origin_table_id
FROM table_2 A
INNER JOIN origin_table B
ON A.id = B.table2_id
)
SELECT * FROM CTE
After applying UNION ALL, the data will look like this:
data_bis origin_table_id
baba 1
bobo 1
bibi 2
bubu 2
byby 2
bebe 1
Now you can apply 'string_agg' to your data as below:
SELECT origin_table_id,string_agg(DISTINCT data_bis,',')
FROM CTE
GROUP BY origin_table_id
And the output will be:
origin_table_id string_agg
1 baba,bebe,bobo
2 bibi,bubu,byby
Now you can join this result further as your requirements dictate.
Please keep in mind that this is not an exact solution to your issue, just an idea.
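The idea of combining the table 1 rows with the one table 2 value each origin row points to, and only then aggregating, can be sketched end to end as follows. This is an illustration under assumptions: it runs on SQLite through Python's sqlite3 module, uses group_concat where PostgreSQL would use array_agg or string_agg, and the names table1, table2 and origin_table are taken from the question's description rather than from a real schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE table1 (id INTEGER, data TEXT, origin_table_id INTEGER);
    CREATE TABLE table2 (id INTEGER, data_bis TEXT);
    CREATE TABLE origin_table (id INTEGER, table2_id INTEGER);
    INSERT INTO table1 VALUES (1, 'baba', 1), (2, 'bobo', 1),
                              (3, 'bibi', 2), (4, 'bubu', 2);
    INSERT INTO table2 VALUES (1, 'byby'), (2, 'bebe');
    INSERT INTO origin_table VALUES (1, 2), (2, 1);
""")

# Stack the table1 values and the single linked table2 value per origin
# row, then aggregate once per origin id. group_concat does not promise
# any element order, so the result is collected into sets here.
cursor = con.execute("""
    SELECT origin_id, group_concat(val, ',')
    FROM (
        SELECT t1.origin_table_id AS origin_id, t1.data AS val
        FROM table1 t1
        UNION ALL
        SELECT o.id, t2.data_bis
        FROM origin_table o
        JOIN table2 t2 ON t2.id = o.table2_id
    ) AS u
    GROUP BY origin_id
    ORDER BY origin_id
""")
aggregated = {origin_id: set(vals.split(",")) for origin_id, vals in cursor}

print(aggregated)
```

The key point is that table 2 is joined through origin_table.table2_id before aggregation, so each group receives exactly one extra value instead of every row of table 2.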

SQL query - select row based on the joined table's criteria

I'm stuck on this simple scenario:
tableA
| ID | TableB_ID | Name |
tableA 1 ---> * tableB
tableB
| ID | Status_ID |
and I need to retrieve the Name column values from tableA for which tableB contains rows with Status_ID = 1 and Status_ID = 2 (two separate rows; there can be more rows with other values, but that doesn't matter here).
Try this:
SELECT A.NAME
FROM TABLE_A AS A INNER JOIN TABLE_B AS B ON A.TABLEB_ID = B.ID
WHERE B.STATUS_ID IN (1, 2) -- OR OTHER VALUES
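Note that the IN filter above returns a name as soon as either status is present. If the requirement is that both statuses exist as separate rows, one possible approach is to group per tableA row and count the distinct statuses found. The sketch below runs on SQLite through Python's sqlite3 module. The sample rows are made up, and it assumes tableB.ID is not unique, so that several tableB rows can belong to one tableA row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tableA (ID INTEGER, TableB_ID INTEGER, Name TEXT);
    CREATE TABLE tableB (ID INTEGER, Status_ID INTEGER);
    -- Made-up rows: only 'alpha' has both status 1 and status 2.
    INSERT INTO tableA VALUES (1, 10, 'alpha'), (2, 20, 'beta'),
                              (3, 30, 'gamma');
    INSERT INTO tableB VALUES (10, 1), (10, 2), (10, 5),
                              (20, 1), (30, 2), (30, 3);
""")

# Keep only the statuses of interest, then require that both of them
# were seen for the same tableA row.
names_with_both = [name for (name,) in con.execute("""
    SELECT a.Name
    FROM tableA AS a
    JOIN tableB AS b ON a.TableB_ID = b.ID
    WHERE b.Status_ID IN (1, 2)
    GROUP BY a.ID, a.Name
    HAVING COUNT(DISTINCT b.Status_ID) = 2
""")]

print(names_with_both)
```

COUNT(DISTINCT ...) guards against a status appearing twice for the same row, which would otherwise satisfy a plain COUNT(*) = 2.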

Populate "Lookup Table" with random values

I have three tables, A B and C. For every entry in A x B (where x is a Cartesian product, or cross join) there is an entry in C.
In other words, the table for C might look like this, if there were 2 entries for A and 3 for B:
| A_ID | B_ID | C_Val |
----------------------|
| 1 | 1 | 100 |
| 1 | 2 | 56 |
| 1 | 3 | 19 |
| 2 | 1 | 67 |
| 2 | 2 | 0 |
| 2 | 3 | 99 |
Thus, for any combination of A and B, there's a value to be looked up in C. I hope this all makes sense.
In practice, the size of A x B may be relatively small for a database, but far too large to populate by hand for testing data. Thus, I would like to randomly populate C's table for whatever data may already be in A and B.
My knowledge of SQL is fairly basic. What I've determined I can do so far is get that cartesian product as an inner query, like so:
(SELECT B.B_ID, C.C_ID
FROM B CROSS JOIN C)
Then I want to say something like follows:
INSERT INTO A(B_ID, C_ID, A_Val) VALUES
(SELECT B.B_ID, C.C_ID, FLOOR(RAND() * 100)
FROM B CROSS JOIN C)
Not surprisingly, this doesn't work. I don't think it's valid syntax to generate a column on the fly like that, nor to try to insert a whole table as values.
How can I basically convert this normal programming pseudocode to proper SQL?
foreach(A_ID in A){
foreach(B_ID in B){
C.insert(A_ID, B_ID, Rand(100));
}
}
The syntax problem is because:
INSERT INTO A(B_ID, C_ID, A_Val) VALUES
(SELECT B.B_ID, C.C_ID, FLOOR(RAND() * 100)
FROM B CROSS JOIN C)
Should be:
INSERT INTO A(B_ID, C_ID, A_Val)
SELECT B.B_ID, C.C_ID, FLOOR(RAND() * 100)
FROM B CROSS JOIN C;
(You don't use VALUES with INSERT/SELECT.)
However, you will still have the problem that RAND() is not evaluated for every row; it will have the same value for every row. Assuming the combination of B_ID and C_ID is unique, you can use something like this:
INSERT INTO A(B_ID, C_ID, A_Val)
SELECT B.B_ID, C.C_ID, ABS(CHECKSUM(RAND(B.B_ID*C.C_ID))) % 100
FROM B CROSS JOIN C;
select A_id,B_Id, abs(checksum(newid()))%101 as C_val from A cross join B
This will give you different values in the range 0 to 100.
Use CTE
With cte as
(SELECT B.B_ID, C.C_ID, ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) as A_Val
FROM B CROSS JOIN C)
Insert into Table(B_ID, C_ID, A_Val)
Select B_ID,C_ID,A_Val from cte
Since RAND generates the same number for every row, you can use NEWID instead.
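The INSERT ... SELECT shape over a cross join can be tried out in isolation. The sketch below follows the question's pseudocode (every A and B pair gets a row in C) and runs on SQLite through Python's sqlite3 module, which is an assumption made purely for illustration: SQLite's random() is evaluated once per row, so the CHECKSUM and NEWID workarounds discussed above are not needed there. The lowercase table and column names are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a (a_id INTEGER PRIMARY KEY);
    CREATE TABLE b (b_id INTEGER PRIMARY KEY);
    CREATE TABLE c (a_id INTEGER, b_id INTEGER, c_val INTEGER,
                    PRIMARY KEY (a_id, b_id));
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1), (2), (3);

    -- INSERT ... SELECT, without VALUES: one row per (a, b) combination.
    -- random() % 101 lies in -100..100, so abs() maps it into 0..100.
    INSERT INTO c (a_id, b_id, c_val)
    SELECT a.a_id, b.b_id, abs(random() % 101)
    FROM a CROSS JOIN b;
""")

lookup = con.execute(
    "SELECT a_id, b_id, c_val FROM c ORDER BY a_id, b_id"
).fetchall()

for row in lookup:
    print(row)
```

The values differ from run to run; only the set of (a_id, b_id) pairs and the 0 to 100 range are fixed.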

How do I print out 'NULL' or '0' values for column values when an element isn't found?

I need to loop through a set of values (less than 10) and see if they are in a table. If so, I need to print out all of the record values, but if the item doesn't exist, I still want it to be included in the printed result, although with NULL or 0 values. So, for example, the following query returns:
select *
from ACTOR
where ID in (4, 5, 15);
+----+-----------------------------+-------------+----------+------+
| ID | NAME | DESCRIPTION | ORDER_ID | TYPE |
+----+-----------------------------+-------------+----------+------+
| 4 | [TEST-1] | | 3 | NULL |
| 5 | [TEST-2] | | 4 | NULL |
+----+-----------------------------+-------------+----------+------+
But I want it to return
+----+-----------------------------+-------------+----------+------+
| ID | NAME | DESCRIPTION | ORDER_ID | TYPE |
+----+-----------------------------+-------------+----------+------+
| 4 | [TEST-1] | | 3 | NULL |
| 5 | [TEST-2] | | 4 | NULL |
| 15| NULL | | 0 | NULL |
+----+-----------------------------+-------------+----------+------+
Is this possible?
To get the output you want, you first have to construct a derived table containing the ACTOR.id values you desire. UNION ALL works for small data sets:
SELECT *
FROM (SELECT 4 AS actor_id
FROM DUAL
UNION ALL
SELECT 5
FROM DUAL
UNION ALL
SELECT 15
FROM DUAL) x
With that, you can OUTER JOIN to the actual table to get the results you want:
SELECT x.actor_id,
a.name,
a.description,
a.order_id,
a.type
FROM (SELECT 4 AS actor_id
FROM DUAL
UNION ALL
SELECT 5
FROM DUAL
UNION ALL
SELECT 15
FROM DUAL) x
LEFT JOIN ACTOR a ON a.id = x.actor_id
If there's no match between x and a, the a columns will be null. So if you want order_id to be zero when there's no match for id 15:
SELECT x.actor_id,
a.name,
a.description,
COALESCE(a.order_id, 0) AS order_id,
a.type
FROM (SELECT 4 AS actor_id
FROM DUAL
UNION ALL
SELECT 5
FROM DUAL
UNION ALL
SELECT 15
FROM DUAL) x
LEFT JOIN ACTOR a ON a.id = x.actor_id
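The derived-table pattern can be checked on a small scale. The following sketch uses SQLite through Python's sqlite3 module, where a VALUES list inside a common table expression plays the role of the UNION ALL ... FROM DUAL chain. The CTE name wanted is hypothetical, and the actor rows are reduced to the two shown in the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE actor (id INTEGER, name TEXT, description TEXT,
                        order_id INTEGER, type TEXT);
    INSERT INTO actor VALUES (4, '[TEST-1]', '', 3, NULL),
                             (5, '[TEST-2]', '', 4, NULL);
""")

# Build the list of requested ids first, then LEFT JOIN the real table
# onto it so ids without a match still produce a row.
actor_rows = con.execute("""
    WITH wanted(actor_id) AS (VALUES (4), (5), (15))
    SELECT w.actor_id,
           a.name,
           a.description,
           COALESCE(a.order_id, 0) AS order_id,
           a.type
    FROM wanted w
    LEFT JOIN actor a ON a.id = w.actor_id
    ORDER BY w.actor_id
""").fetchall()

for row in actor_rows:
    print(row)
```

Id 15 comes back with NULL (None in Python) for the actor columns, except order_id, which COALESCE turns into 0.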
Well, for that few values, you could do something ugly like this, I suppose:
SELECT
*
FROM
(
SELECT 4 AS id UNION
SELECT 5 UNION
SELECT 15
) ids
LEFT JOIN ACTOR ON ids.id = ACTOR.ID
(That should work in MySQL, I think; for Oracle you'd need to use DUAL, e.g. SELECT 4 as id FROM DUAL...)
Another option is to use a temporary table.
CREATE TABLE actor_temp (id INTEGER);
INSERT INTO actor_temp VALUES(4);
INSERT INTO actor_temp VALUES(5);
INSERT INTO actor_temp VALUES(15);
select actor_temp.id, ACTOR.* from ACTOR RIGHT JOIN actor_temp on ACTOR.id = actor_temp.id;
DROP TABLE actor_temp;
If you know the upper and lower limits on the ID, it's not too bad. Set up a view with all possible ids - the connect by trick is the simplest way - and do an outer join with your real table. Here, I've limited it to values from 1-1000.
select * from (
select ids.id, a.name, a.description, nvl(a.order_id,0), a.type
from Actor a,
(SELECT level as id from dual CONNECT BY LEVEL <= 1000) ids
where ids.id = a.id (+)
)
where id in (4,5,15);
Can you make a table that contains expected actor ids?
If so you can left join from it.
SELECT * FROM expected_actors LEFT JOIN actors USING (ID)