Select data by conditions between two tables that are not directly linked - sql

I have a large select with a lot of inner joins.
In the select I have an array_agg function for one set of data.
This array contains only a column of a table, but now I want to append at the end of the array data from another table. The data I need to add is not directly linked with the previous table where I need the column.
Query example:
select
origin_table.x,
origin_table.y,
array_agg(table1.data) ...
from
origin_table
inner join ... inner join ... full join table1 on
table1.origin_table_id = origin_table.id ...
group by
...
Result array:
ID 1: example_data, {baba, bobo}
ID 2: example_data, {bibi, bubu}
Example of my tables:
table 1:
id | data | origin_table_id
----+---------+----------
1 | baba | 1
2 | bobo | 1
3 | bibi | 2
4 | bubu | 2
table 2:
id | data_bis
---+---------
1 | byby
2 | bebe
origin table:
id | table2_id
---+----------
1 | 2
2 | 1
Expected result with the 3 tables:
ID 1: example_data, {baba, bobo, bebe}
ID 2: example_data, {bibi, bubu, byby}
But got :
ID 1: example_data, {baba, bobo, bebe, byby}
ID 2: example_data, {bibi, bubu, bebe, byby}
What I need is:
How can I get all the table 1 data that satisfies the condition and append to it only the single matching table 2 value, rather than every row of table 2?

Try the query below.
create table tab1 (id integer,data character varying,origin_table_id integer);
insert into tab1
select 1,'baba',1
union all
select 2,'bobo',1
union all
select 3,'bibi',2
union all
select 4,'bubu',2;
create table tab2 (id integer,data_bis character varying);
insert into tab2
select 1,'byby'
union all
select 2,'bebe';
create table OriginalTable (id integer,table2_id integer);
insert into OriginalTable
select 1,2
union all
select 2,1;
select * from OriginalTable;
-- table 1 rows plus, for each origin row, the single table 2 value it points to
select origin_table_id,data
from tab1
union all
select OriginalTable.id,data_bis
from OriginalTable
join tab2 on tab2.id = OriginalTable.table2_id
order by origin_table_id;
Result:
1;"baba"
1;"bobo"
1;"bebe"
2;"bibi"
2;"bubu"
2;"byby"

Here is an idea with some sample code.
First, UNION ALL 'table 1' with 'table 2', using 'origin table' to relate them.
WITH CTE
AS
(
SELECT AA.id,AA.data AS data_bis,AA.origin_table_id
FROM table_1 AA
UNION ALL
SELECT NULL id, A.data_bis,B.id origin_table_id
FROM table_2 A
INNER JOIN origin_table B
ON A.id = B.table2_id
)
SELECT * FROM CTE
After applying UNION ALL, the data will look like this (id column omitted):
data_bis origin_table_id
baba 1
bobo 1
bibi 2
bubu 2
byby 2
bebe 1
Now you can apply string_agg to the combined data (in the same statement as the CTE):
SELECT origin_table_id,string_agg(DISTINCT data_bis,',')
FROM CTE
GROUP BY origin_table_id
And the output will be-
origin_table_id string_agg
1 baba,bebe,bobo
2 bibi,bubu,byby
You can then join this result to the rest of your query as required.
Please keep in mind that this is not an exact solution to your issue, just an idea.
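For anyone who wants to try the UNION ALL idea end to end, here is a minimal sketch. It rests on a few assumptions that are not in the question: it runs on an in-memory SQLite database through Python's sqlite3 module (so no PostgreSQL server is needed), group_concat stands in for array_agg/string_agg, and table_1, table_2 and origin_table are simply the question's tables spelled with underscores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (id INTEGER, data TEXT, origin_table_id INTEGER);
INSERT INTO table_1 VALUES (1,'baba',1),(2,'bobo',1),(3,'bibi',2),(4,'bubu',2);
CREATE TABLE table_2 (id INTEGER, data_bis TEXT);
INSERT INTO table_2 VALUES (1,'byby'),(2,'bebe');
CREATE TABLE origin_table (id INTEGER, table2_id INTEGER);
INSERT INTO origin_table VALUES (1,2),(2,1);
""")

rows = conn.execute("""
WITH combined AS (
    -- every table_1 value, keyed by the origin row it belongs to
    SELECT origin_table_id, data FROM table_1
    UNION ALL
    -- plus exactly one table_2 value per origin row, via origin_table.table2_id
    SELECT o.id, t2.data_bis
    FROM origin_table o
    JOIN table_2 t2 ON t2.id = o.table2_id
)
SELECT origin_table_id, group_concat(data, ',')
FROM combined
GROUP BY origin_table_id
ORDER BY origin_table_id
""").fetchall()

# The order of elements inside group_concat is not guaranteed, so sort them.
result = {origin_id: sorted(joined.split(",")) for origin_id, joined in rows}
print(result)
```

The point of the second branch is that it joins table_2 through origin_table, so each origin row contributes one extra value instead of every row of table_2.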

Related

SQL counting columns with same userID

I have two tables. Table 1 contains the content_ids that meet certain criteria. Table 2 contains the content_id, content, and related user_id. They share a content_id field. I would like to produce a list of who has the most entries in Table 1.
Example
Table 1
content_id {1, 2, 3, 4, 5, 6}
Table 2
content_id|user_id { 1|2 , 2|3 , 3|2 , 4|1 , 5|3, 6|2 }
Desired results
user 2 has 3 entries
user 3 has 2 entries
user 1 has 1 entry
I imagine I need to INNER JOIN the two tables by content_id and then somehow use COUNT or similar?
Assuming you want to "filter" by the content of t1, one possibility is:
create table t1 (content_id int);
insert into t1 values (2), (3), (4), (5), (6);
-- note: I omitted 1 so that not all values are present
create table t2 (content_id int, user_id int);
insert into t2 values (1,2), (2,3), (3,2), (4,1), (5,3), (6,2);
select user_id, count(*) from t2 where exists (select 1 from t1 where content_id=t2.content_id) group by user_id;
-- output:
user_id | count
---------+-------
3 | 2
2 | 2
1 | 1
(3 rows)
-- OR
select user_id, count(*) from t2 where content_id in (select content_id from t1) group by user_id;
-- output:
user_id | count
---------+-------
3 | 2
2 | 2
1 | 1
(3 rows)
But other answers are already doing this too.
Here is the code snippet that will produce your desired result:
Select
tb2.user_id,count(tb2.content_id)
From table1 tb1
Inner join table2 tb2
on tb1.content_id=tb2.content_id
group by tb2.user_id
Are you looking for a simple group by?
select user_id, count(*)
from table2
group by user_id;
EDIT:
If you want to restrict the content ids, I recommend exists:
select t2.user_id, count(*)
from table2 t2
where exists (select 1
from table1 t1
where t1.content_id = t2.content_id
)
group by user_id;
I hope this is what you are looking for and that it helps.
select user_id, count(`content_id`) from table2 t2
where exists (select t1.content_id from table1 t1
where t1.content_id=t2.content_id)
group by user_id order BY count(`content_id`) DESC;
+---------+---------------------+
| user_id | count(`content_id`) |
+---------+---------------------+
| 2 | 3 |
| 3 | 2 |
| 1 | 1 |
+---------+---------------------+

How to do selection in PostgreSQL with join when more than one row satisfies requirements?

How can I get a JSON array in one cell when doing an INNER JOIN and more than one row matches?
ex Tables:
T1:
id | name
1 Tom
2 Dom
T2:
user_id | product
1 Milk
2 Cookies
2 Banana
Naturally I do SELECT * FROM T1 INNER JOIN T2 ON T1.id = T2.user_id.
But then I get:
id | Name | product
1 Tom Milk
2 Dom Cookies
2 Dom Banana
But I want to get:
id | Name | product
1 Tom [{"product":"Milk"}]
2 Dom [{"product":"Cookies"}, {"product":"Banana"}]
If I use aggregate functions, I need to put everything else in GROUP BY, which means at least 10 columns, and the whole query takes more than 5 minutes.
My T1 is around 4000 rows and T2 around 300 000 rows, each associated with some row in T1.
Is there a better way?
Using LATERAL, you can solve it as in the example below:
-- The query
SELECT *
FROM table1 t1,
LATERAL ( SELECT jsonb_agg(
jsonb_build_object( 'product', product )
)
FROM table2
WHERE user_id = t1.id
) t2( product );
-- Result
id | name | product
----+------+-------------------------------------------------
1 | Tom | [{"product": "Milk"}]
2 | Dom | [{"product": "Cookies"}, {"product": "Banana"}]
(2 rows)
-- Test data
CREATE TABLE IF NOT EXISTS table1 (
id int,
"name" text
);
INSERT INTO table1
VALUES ( 1, 'Tom' ),
( 2, 'Dom' );
CREATE TABLE IF NOT EXISTS table2 (
user_id int,
product text
);
INSERT INTO table2
VALUES ( 1, 'Milk' ),
( 2, 'Cookies' ),
( 2, 'Banana' );
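If LATERAL is not available, another way to avoid listing every T1 column in GROUP BY is to aggregate T2 on its own first and then join the already-collapsed result. Below is a rough sketch of that idea, with some assumptions of mine: it uses SQLite through Python's sqlite3 module rather than PostgreSQL, group_concat plus a Python-side json.dumps stand in for jsonb_agg/jsonb_build_object, and the '|' separator assumes product names never contain that character.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, name TEXT);
INSERT INTO table1 VALUES (1,'Tom'),(2,'Dom');
CREATE TABLE table2 (user_id INTEGER, product TEXT);
INSERT INTO table2 VALUES (1,'Milk'),(2,'Cookies'),(2,'Banana');
""")

rows = conn.execute("""
SELECT t1.id, t1.name, agg.products
FROM table1 t1
JOIN (
    -- GROUP BY touches only table2, so table1 can have any number of columns
    SELECT user_id, group_concat(product, '|') AS products
    FROM table2
    GROUP BY user_id
) agg ON agg.user_id = t1.id
ORDER BY t1.id
""").fetchall()

# Turn each '|'-separated string into the JSON array asked for in the question.
result = [
    (id_, name, json.dumps([{"product": p} for p in sorted(products.split("|"))]))
    for id_, name, products in rows
]
for row in result:
    print(row)
```

Note that the inner join drops users with no products; a LEFT JOIN would keep them with a NULL in the last column. Whether this is faster than the LATERAL version depends on the data and indexes, so it is worth comparing both plans.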

Join Postgresql tables on json array columns

I have two tables in PostgreSQL with json array columns tableA.B and tableB.B. How can I join these tables on these json columns?
i.e.
select tableA.id, tableA.A, tableB.id,tableA.B, tableB.name
from tableA, tableB
where tableA.B = tableB.B
--tableA--
id | A | B
1 | 36464 | ["874746", "474657"]
2 | 36465 | ["874748"]
3 | 36466 | ["874736", "474654"]
--tableB--
id | name | B
1 | john | ["8740246", "2474657"]
2 | mary | ["874748","874736"]
3 | clara | ["874736", "474654"]
Actually, with the data type jsonb in Postgres 9.4 or later, this becomes dead simple. Your query would just work (ugly naming convention, code and duplicate names in the output aside).
CREATE TEMP TABLE table_a(a_id int, a int, b jsonb);
INSERT INTO table_a VALUES
(1, 36464, '["874746", "474657"]')
, (2, 36465, '["874748"]')
, (3, 36466, '["874736", "474654"]');
CREATE TEMP TABLE table_b(b_id int, name text, b jsonb);
INSERT INTO table_b VALUES
(1, 'john' , '["8740246", "2474657"]')
, (2, 'mary' , '["874748","874736"]')
, (3, 'clara', '["874736", "474654"]');
Query:
SELECT a_id, a, b.*
FROM table_a a
JOIN table_b b USING (b); -- match on the whole jsonb column
That you even ask indicates you are using the data type json, for which no equality operator exists:
How to query a json column for empty objects?
You just didn't mention the most important details.
The obvious solution is to switch to jsonb.
Answer to your comment
is it possible to flatten out b into new rows rather than an array?
Use jsonb_array_elements(jsonb) or jsonb_array_elements_text(jsonb) in a LATERAL join:
SELECT a_id, a, b.b_id, b.name, b_array_element
FROM table_a a
JOIN table_b b USING (b)
, jsonb_array_elements_text(b) b_array_element
This returns only rows matching on the whole array. About LATERAL:
What is the difference between LATERAL and a subquery in PostgreSQL?
If you want to match on array elements instead, unnest your arrays before you join.
The whole setup seems to be in dire need of normalization.
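To illustrate what matching on array elements after normalization could look like, here is a small sketch. It is only an approximation of the PostgreSQL approach: the unnest step is done in Python with json.loads instead of jsonb_array_elements_text, the join runs on in-memory SQLite via sqlite3, and the table names a_elem and b_elem are made up for the example.

```python
import json
import sqlite3

table_a = [(1, 36464, '["874746", "474657"]'),
           (2, 36465, '["874748"]'),
           (3, 36466, '["874736", "474654"]')]
table_b = [(1, "john", '["8740246", "2474657"]'),
           (2, "mary", '["874748","874736"]'),
           (3, "clara", '["874736", "474654"]')]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a_elem (a_id INTEGER, elem TEXT);
CREATE TABLE b_elem (b_id INTEGER, name TEXT, elem TEXT);
""")

# Normalize: one row per array element instead of one array per row.
conn.executemany(
    "INSERT INTO a_elem VALUES (?, ?)",
    [(a_id, e) for a_id, _a, arr in table_a for e in json.loads(arr)],
)
conn.executemany(
    "INSERT INTO b_elem VALUES (?, ?, ?)",
    [(b_id, name, e) for b_id, name, arr in table_b for e in json.loads(arr)],
)

# With one element per row, matching on elements is a plain equi-join.
matches = conn.execute("""
SELECT a.a_id, b.name, a.elem
FROM a_elem a
JOIN b_elem b ON b.elem = a.elem
ORDER BY a.a_id, b.b_id, a.elem
""").fetchall()
for m in matches:
    print(m)
```

Unlike the whole-array join, this also pairs table A row 2 and row 3 with mary, because they share single elements with her array.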
WITH tableA AS
(SELECT 1 AS id,
36464 AS A,
'["874746", "474657"]'::jsonb AS B
UNION SELECT 2 AS id,
36465 AS A,
'["874748"]'::jsonb AS B
UNION SELECT 3 AS id,
36466 AS A,
'["874736", "474654"]'::jsonb AS B),
tableB AS
( SELECT 1 AS id,
'john' AS name,
'["8740246", "2474657"]'::jsonb AS B
UNION SELECT 2 AS id,
'mary' AS name,
'["874748", "874736"]'::jsonb AS B
UNION SELECT 3 AS id,
'clara' AS name,
'["874736", "474654"]'::jsonb AS B)
SELECT *
FROM tableA
inner join tableB using(B);
Gives you
b | id | a | id | name
----------------------+----+-------+----+-------
["874736", "474654"] | 3 | 36466 | 3 | clara
Isn't it what you expect?

Merging two tables into new table by ID and date

Let's say I have two tables:
table 1:
col a | col b | col c
1 | 62215 | 21
1 | 62015 | 22
2 | 62215 | 23
2 | 51315 | 24
and table 2:
col a | col b| col f
1 | 62015| z
1 | 62215| x
2 | 51315| y
2 | 62215| t
Neither column a nor column b is unique on its own, but the pairs (col a, col b) are all unique. How would I go about merging these two tables to produce a
Table 3:
col a | col b| col c | col f
I want to combine these tables into one big table, so that the new table has the values of columns a and b from both tables, along with the columns unique to either table 1 or table 2.
I'm sure this is an extremely simple problem using MERGE or UNION, but I don't use SQL at all, so I don't know what it would look like.
Thank you.
You can use a SELECT statement when inserting into a table. What I would do here is write a select statement that pulls all of the columns you need first. You will have to do a full outer join (simulated by a union of left and right joins) because some pairs may exist in one table but not the other. The select would look like this:
SELECT t1.colA, t1.colB, t1.colC, t2.colF
FROM tab1 t1
LEFT JOIN tab2 t2 ON t2.colA = t1.colA AND t2.colB = t1.colB
UNION
-- take the key columns from t2 here: t1's are NULL for pairs that exist only in tab2
SELECT t2.colA, t2.colB, t1.colC, t2.colF
FROM tab1 t1
RIGHT JOIN tab2 t2 ON t2.colA = t1.colA AND t2.colB = t1.colB;
Then, to insert into table 3, put the INSERT in front of that query (mySelect stands for the whole SELECT above):
INSERT INTO tab3 (mySelect);
Note that for the pairs that exist in one table and not the other, you will get NULL values. For example, if a row exists in table 1 and not table 2, colF will be null in table 3.
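Here is a small runnable sketch of that emulated full outer join feeding an INSERT. Some assumptions on my side: it runs on in-memory SQLite through Python's sqlite3 module, it writes the second half as a LEFT JOIN starting from tab2 (equivalent to the RIGHT JOIN, but accepted by older SQLite versions), and the rows with keys (3, 70115) and (4, 80115) are invented so that each table has one pair the other lacks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab1 (colA INTEGER, colB INTEGER, colC INTEGER);
INSERT INTO tab1 VALUES (1,62215,21),(1,62015,22),(2,62215,23),(2,51315,24),
                        (3,70115,25);   -- hypothetical: pair only in tab1
CREATE TABLE tab2 (colA INTEGER, colB INTEGER, colF TEXT);
INSERT INTO tab2 VALUES (1,62015,'z'),(1,62215,'x'),(2,51315,'y'),(2,62215,'t'),
                        (4,80115,'w');  -- hypothetical: pair only in tab2
CREATE TABLE tab3 (colA INTEGER, colB INTEGER, colC INTEGER, colF TEXT);

INSERT INTO tab3 (colA, colB, colC, colF)
SELECT t1.colA, t1.colB, t1.colC, t2.colF
FROM tab1 t1
LEFT JOIN tab2 t2 ON t2.colA = t1.colA AND t2.colB = t1.colB
UNION   -- UNION (not UNION ALL) removes the matched rows produced by both halves
SELECT t2.colA, t2.colB, t1.colC, t2.colF
FROM tab2 t2
LEFT JOIN tab1 t1 ON t1.colA = t2.colA AND t1.colB = t2.colB;
""")

merged = conn.execute(
    "SELECT colA, colB, colC, colF FROM tab3 ORDER BY colA, colB"
).fetchall()
for row in merged:
    print(row)
```

The two invented rows come back with None (NULL) in colF and colC respectively, which is the behaviour described in the note above.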
I'm not sure I understand the data. Are you saying that colA and colB are unique across both tables? Do you want a join or a union?
As a union:
Select `col a`,`col b`, `col c`, null
from `table 1`
union
Select `col a`,`col b`, null, `col f`
from `table 2`
As a join:
Select `table 1`.`col a`,`table 1`.`col b`,`table 1`.`col c`,`table 2`.`col f`
from `table 1` join `table 2`
on `table 1`.`col a` = `table 2`.`col a`
and `table 1`.`col b` = `table 2`.`col b`
To insert into a table, put
insert into `table 3`
in front of it.
I hope this helps.
The union would produce 1 record for every record in each of table 1 and 2. A join will combine the data if col a and col b are the same in both tables. I think the latter is what you want. I didn't use a union all because that might create duplicates.

Show elements of where clause that are not present in table

I search a table based on an ID column in my where clause. I have a list of IDs that may or may not be present in this table. A simple query will give me the IDs which exist in that table (if any). Is there a way to also return the IDs that were not found?
Table --
ID
1GH
2BN
3ER
SELECT *
FROM Table
WHERE ID IN (big list 9FG, 1GH, 3UI etc)
--If IDs in the above list are not in the table, then show those IDs.
Desired output -
9FG, 3UI were not found in the table
If I understand correctly what you need, you can do it this way:
SELECT q.id,
CASE WHEN t.id IS NULL THEN 'no' ELSE 'yes' END id_exists
FROM
(
SELECT '9FG' id UNION ALL
SELECT '1GH' UNION ALL
SELECT '3UI'
) q LEFT JOIN table1 t
ON q.id = t.id
Output:
| ID | ID_EXISTS |
|-----|-----------|
| 9FG | no |
| 1GH | yes |
| 3UI | no |
or if you just need a list of non-existent ids
SELECT q.id
FROM
(
SELECT '9FG' id UNION ALL
SELECT '1GH' UNION ALL
SELECT '3UI'
) q LEFT JOIN table1 t
ON q.id = t.id
WHERE t.id IS NULL
Output:
| ID |
|-----|
| 9FG |
| 3UI |
The trick is to use an OUTER JOIN instead of WHERE condition to filter data from your table and be able to see the mismatches.
To search you can use
SELECT *
From Mytable
where id in (
select id from (values (1), (2), (3)) as SearchedIds(Id) )
and the opposite to find unmatched:
SELECT id from (values (1), (2), (3)) as SearchedIds(Id)
WHERE id not in (SELECT id From MyTable)
The syntax
Values(...) as SearchedIds(id)
is supported in SQL Server 2008 and later; for SQL Server 2005 you have to do
( SELECT 1 as Id UNION ALL SELECT 2 UNION ALL ...etc ) as SearchedIds
Note: you can rewrite those queries with JOINS (INNER and LEFT)
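The LEFT JOIN rewrite could look roughly like the sketch below. It is written under assumptions that are not in the question: SQLite through Python's sqlite3 module instead of SQL Server, the searched IDs loaded into a temporary table called searched_ids (a hypothetical name), and my_table holding the three sample IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (id TEXT);
INSERT INTO my_table VALUES ('1GH'),('2BN'),('3ER');
CREATE TEMP TABLE searched_ids (id TEXT);
""")

# The "big list" of IDs to look for; loading it as rows lets SQL join against it.
searched = ["9FG", "1GH", "3UI"]
conn.executemany("INSERT INTO searched_ids VALUES (?)", [(s,) for s in searched])

# Keep every searched ID, then filter to those with no partner in my_table.
missing = [row[0] for row in conn.execute("""
SELECT s.id
FROM searched_ids s
LEFT JOIN my_table t ON t.id = s.id
WHERE t.id IS NULL
ORDER BY s.id
""")]
print(missing)
```

The list drives the query (it is on the left side of the join), which is why IDs absent from the table can still show up in the result.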
Maybe something like:
SELECT id FROM my_table WHERE id NOT IN (val1, val2, val3)