PostgreSQL array_agg, INNER JOIN and LEFT JOIN problems

I have a slight problem with one of my queries. The goal of this query is to get all the table1 items of a user, along with their related information. As you can see, the data model is quite complex (for good reasons), and this requires a big query (my goal is to gather everything with a single query).
Here is the data model :
What I want :
All T1 info
All T2 info for one T1 item (it is a 1-to-n relation, so I'll use array_agg)
All T3 info for one T1 item
All T4 info for one T1 item
All T6 info for one T1 item
i18n info for the T1 item
Here is the SELECT * output for table1_table2:
 table1_id | table2_id
-----------+-------------
 item2id   | table2item1
 item4id   | table2item2
 item4id   | table2item1
 item5id   | table2item3
 item5id   | table2item2
And for table4_table6:
 table4_id   | table6_id
-------------+-------------
 table4item1 | table6item1
 table4item1 | table6item2
 table4item2 | table6item2
 table4item3 | table6item3
 table4item1 | table6item3
 table4item2 | table6item3
Here is the table1 SELECT, with its id and its foreign key:
 table1_id | table3_id
-----------+-------------
 item1id   | table3item1
 item2id   | table3item1
 item6id   | table3item4
 item3id   | table3item2
 item4id   | table3item2
 item5id   | table3item3
Same for table3:
 table3_id   | table4_id
-------------+-------------
 table3item1 | table4item1
 table3item4 | table4item1
 table3item2 | table4item2
 table3item3 | table4item3
Finally, here is my query :
SELECT t1.id,
       na.name,
       array_to_json(array_agg(row_to_json(t2))) AS table2items,
       array_to_json(array_agg(row_to_json(t6))) AS table6items
FROM table1 t1
INNER JOIN table1_i18n na ON na.table1_id = t1.id
INNER JOIN table3 t3 ON t3.id = t1.table3_id
INNER JOIN table4 t4 ON t4.id = t3.table4_id
LEFT JOIN table1_table2 t1t2 ON t1t2.table1_id = t1.id
LEFT JOIN table2 t2 ON t2.id = t1t2.table2_id
LEFT JOIN table4_table6 t4t6 ON t4t6.table4_id = t3.table4_id
LEFT JOIN table6 t6 ON t6.id = t4t6.table6_id
WHERE t1.user_id = 'myuserid' AND na.lang = 'en_US'
GROUP BY t1.id, na.name, t4.id
ORDER BY t1.id;
Here is the result :
id | name | table3_id | table4_id | table2items | table6items
-------------+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
item1id | MyFirstItem | table3item1 | table4item1 | [null,null,null] | [{"id":"table6item1"},{"id":"table6item2"},{"id":"table6item3"}]
item2id | MySecondItem | table3item1 | table4item1 | [{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item1","data1":"damage","data2":10}] | [{"id":"table6item1"},{"id":"table6item2"},{"id":"table6item3"}]
item3id | MyThirdItem | table3item2 | table4item2 | [null,null] | [{"id":"table6item2"},{"id":"table6item3"}]
item4id | MyFourthItem | table3item2 | table4item2 | [{"id":"table2item2","data1":"range","data2":20},{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item2","data1":"range","data2":20},{"id":"table2item1","data1":"damage","data2":10}] | [{"id":"table6item2"},{"id":"table6item3"},{"id":"table6item3"},{"id":"table6item2"}]
item5id | MyFifthItem | table3item3 | table4item3 | [{"id":"table2item3","data1":"range","data2":20},{"id":"table2item2","data1":"range","data2":20}] | [{"id":"table6item3"},{"id":"table6item3"}]
item6id | MySixthItem | table3item4 | table4item1 | [null,null,null] | [{"id":"table6item2"},{"id":"table6item1"},{"id":"table6item3"}]
Well, I've got a problem here. As you can see, my table2items and table6items arrays always have the same size. I don't know the reason for this, but it seems that I'm missing something.
Worse, instead of filling the array with null values, this query creates duplicates which should not appear.
Details :
item1 and item6 have the same problem: no links to table2 and 3 items in table6. I end up with an array [null, null, null] for table2items.
item2 has 3 links to table6 and 1 to table2. I end up with the same table2 object 3 times in the array.
item4: I don't know what's happening here. It should have 2 things in each array, and I've got 4 (duplicates).
item5: you can clearly see the duplication.
I have tried to group by table6.id or table2.id. It doesn't work (I get one row for each of them, so several rows for each item).
Note : If I do
SELECT t1.id,
       na.name,
       array_to_json(array_agg(row_to_json(t2))) AS table2items
FROM table1 t1
INNER JOIN table1_i18n na ON na.table1_id = t1.id
INNER JOIN table3 t3 ON t3.id = t1.table3_id
INNER JOIN table4 t4 ON t4.id = t3.table4_id
LEFT JOIN table1_table2 t1t2 ON t1t2.table1_id = t1.id
LEFT JOIN table2 t2 ON t2.id = t1t2.table2_id
WHERE t1.user_id = 'myuserid' AND na.lang = 'en_US'
GROUP BY t1.id, na.name, t4.id
ORDER BY t1.id;
alone, it works perfectly. The same goes for t6. It's only when I try to gather everything at the same time that I run into problems.
If it is not clear enough, ask for details. It's really not easy to explain such a problem :).
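One likely reason the two arrays always have the same length: both LEFT JOIN chains fan out from the same t1 row, so within each group every table2 match is paired with every table6 match before array_agg runs. The sketch below reproduces that effect for item2id using Python's sqlite3. The cut-down schema (t1, t1_t2, t4_t6 and their columns) is a hypothetical stand-in for the real model, and group_concat stands in for array_agg. Aggregating each side in its own subquery is one way to keep the two lists independent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for the tables in the question.
    CREATE TABLE t1 (id TEXT, t4_id TEXT);
    CREATE TABLE t1_t2 (t1_id TEXT, t2_id TEXT);
    CREATE TABLE t4_t6 (t4_id TEXT, t6_id TEXT);
    INSERT INTO t1 VALUES ('item2id', 'table4item1');
    INSERT INTO t1_t2 VALUES ('item2id', 'table2item1');
    INSERT INTO t4_t6 VALUES ('table4item1', 'table6item1'),
                             ('table4item1', 'table6item2'),
                             ('table4item1', 'table6item3');
""")

# Joining both link tables at once: 1 table2 match x 3 table6 matches = 3 rows,
# so an aggregate over this group sees the single table2 row three times.
joined_rows = conn.execute("""
    SELECT t1.id, a.t2_id, b.t6_id
    FROM t1
    LEFT JOIN t1_t2 a ON a.t1_id = t1.id
    LEFT JOIN t4_t6 b ON b.t4_id = t1.t4_id
""").fetchall()

# Aggregating each side in its own correlated subquery avoids the pairing.
t2_list, t6_list = conn.execute("""
    SELECT (SELECT group_concat(t2_id) FROM t1_t2 WHERE t1_id = t1.id),
           (SELECT group_concat(t6_id) FROM t4_t6 WHERE t4_id = t1.t4_id)
    FROM t1
""").fetchone()

print(len(joined_rows), t2_list, sorted(t6_list.split(",")))
```

In PostgreSQL the same shape would presumably put array_agg inside each subquery instead of group_concat; that translation is an assumption and has not been run against the original schema.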

Related

How to populate a table based on a value from a different table

I have two tables of data which I can join using a left join linked on the ID in both tables. Where the course and the person are the same, I need to populate the RegNumber as the same as the RegNumber which is already there for 1 row:
How it is currently: if I join table 1 and table 2 with a left join.
Table 1
ID | Course| Person
67705 | A | 1
68521 | A | 1
85742 | A | 1
89625 | A | 1
67857 | B | 2
86694 | B | 2
88075 | B | 2
88710 | C | 3
47924 | C | 3
66981 | C | 3
12311 | B | 1
12312 | B | 1
12313 | B | 1
Table 2
ID | RegNumber
67705 | N712316
NULL | NULL
NULL | NULL
NULL | NULL
67857 | N712338
NULL | NULL
NULL | NULL
NULL | NULL
47924 | M481035
NULL | NULL
12311 | N645525
NULL | NULL
NULL | NULL
I need table 2 to look like this:
ID | RegNumber
67705 | N712316
68521 | N712316
85742 | N712316
89625 | N712316
67857 | N712338
86694 | N712338
88075 | N712338
88710 | M481035
47924 | M481035
66981 | M481035
12311 | N645525
12312 | N645525
12313 | N645525
That is, I need to insert new rows into Table 2.
Can anyone help me please? This is totally beyond my capability!
insert into table2 (ID, RegNumber)
select t1.ID, reg.RegNumber
from table1 t1
-- cross apply lets the subquery reference t1; a plain cross join cannot
cross apply (select top 1 r2.RegNumber
             from table2 r2
             join table1 r1 on r1.ID = r2.ID
             where r1.Course = t1.Course
               and r1.Person = t1.Person
             order by r1.ID) reg
where not exists (select 1 from table2 t2 where t1.ID = t2.ID)
You can improve performance a little by loading the data into a temp table first:
select t1.ID, t1.Course, t1.Person, t2.RegNumber
into #LoadedData
from table1 t1
join table2 t2 on t1.ID = t2.ID;
insert into table2 (ID, RegNumber)
select t1.ID, reg.RegNumber
from table1 t1
cross apply (select top 1 l.RegNumber
             from #LoadedData l
             where l.Course = t1.Course
               and l.Person = t1.Person
             order by l.ID) reg
where not exists (select 1 from #LoadedData l where t1.ID = l.ID);
In either case, having an index on (ID, Course, Person) will help with performance.
Assuming:
You are missing items in table 2 that inherit data from other records in table 1.
What makes two different IDs share the same RegNumber is having BOTH course and person in common.
You really need to join table 1 to itself to create the mapping that associates ID 67705 with ID 68521, then you can join in table 2 to pick up the Regnumber.
Try this:
Insert into table2 (ID, RegNumber)
Select right1.ID, left2.RegNumber
From table2 left2
INNER JOIN table1 left1 On (left1.ID = left2.ID)
INNER JOIN table1 right1 On (left1.Course = right1.Course AND left1.Person = right1.Person)
LEFT OUTER JOIN table2 right2 On (right1.ID = right2.ID)
WHERE right2.ID Is Null
The 4th table join (alias right2) is purely defensive, to handle two records in table2 having identical Person & Course in table1.
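The self-join idea can be tried on a few of the question's rows. This is only a sketch: it uses Python's sqlite3 rather than SQL Server, keeps just two Course/Person groups, and assumes table2 initially holds only the rows that already have a RegNumber.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER, Course TEXT, Person INTEGER);
    CREATE TABLE table2 (ID INTEGER, RegNumber TEXT);
    INSERT INTO table1 VALUES (67705, 'A', 1), (68521, 'A', 1),
                              (88710, 'C', 3), (47924, 'C', 3);
    -- Assumption: only the already-registered rows exist in table2.
    INSERT INTO table2 VALUES (67705, 'N712316'), (47924, 'M481035');
""")

# left1/left2: rows that already have a RegNumber.
# right1: every table1 row sharing Course and Person with such a row.
# right2: filters out IDs that are already present in table2.
conn.execute("""
    INSERT INTO table2 (ID, RegNumber)
    SELECT right1.ID, left2.RegNumber
    FROM table2 left2
    JOIN table1 left1  ON left1.ID = left2.ID
    JOIN table1 right1 ON right1.Course = left1.Course
                      AND right1.Person = left1.Person
    LEFT JOIN table2 right2 ON right2.ID = right1.ID
    WHERE right2.ID IS NULL
""")

filled = dict(conn.execute("SELECT ID, RegNumber FROM table2"))
print(filled)
```

Each missing ID inherits the RegNumber of the row that shares its Course and Person, and the IDs already in table2 are left untouched.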
I have solved this myself.
I concatenated the person and course columns and then joined them using that new concatenated field
insert into table2 (ID,RegNumber)
select X1.ID,X2.Regnumber
from (select concat(course,person) as X,ID from table1) X1
join (select concat(t1.course,t1.person) as X, t2.RegNumber
from table1 t1
join table2 t2 on t1.ID = t2.ID) X2
on X1.X = X2.X
where X1.ID not in (select ID from table2)

SQL Query subquery in select

I have 2 tables.
Table1
+----+------+
| Id | Name |
+----+------+
| | |
+----+------+
Table2
+-----+-----------+------+-------+---------+
| Id | Table1_ID | Name | Value | Created |
+-----+-----------+------+-------+---------+
| | | | | |
+-----+-----------+------+-------+---------+
When I run a SELECT * FROM Table2, I want the Table1_ID to be replaced with the name of that item ID from Table 1, rather than the ID. How can I do that?
Use an inner join, like this:
SELECT
    T2.Id,
    T1_Name = T1.Name, -- replaces Table1_ID
    T2_Name = T2.Name,
    T2.Value,
    T2.Created
FROM Table1 T1
INNER JOIN Table2 T2
    ON T1.ID = T2.Table1_ID
You can use INNER JOIN for that.
INNER JOIN Syntax 1
SELECT *
FROM table1
INNER JOIN table2 ON table1.id = table2.fk_id
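INNER JOIN Syntax 2 (not standard SQL: MySQL accepts INNER JOIN without ON, but engines such as PostgreSQL and SQL Server reject it)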
SELECT *
FROM table1
INNER JOIN table2
WHERE table1.id = table2.fk_id
  
SELECT Table2.Id, Table2.Name, Table1.Name, Table2.Value, Table2.Created
FROM Table2
INNER JOIN Table1 ON Table1.ID = Table2.Table1_ID
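A small runnable check of the join above, using Python's sqlite3. The sample rows (parent-a, child-x and so on) and the Table1_Name alias are made up for illustration, since the question shows empty tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Table1_ID INTEGER,
                         Name TEXT, Value INTEGER, Created TEXT);
    -- Hypothetical sample data.
    INSERT INTO Table1 VALUES (1, 'parent-a'), (2, 'parent-b');
    INSERT INTO Table2 VALUES (10, 2, 'child-x', 5, '2020-01-01'),
                              (11, 1, 'child-y', 7, '2020-01-02');
""")

# Table1.Name takes the place of Table2.Table1_ID in the output.
rows = conn.execute("""
    SELECT Table2.Id, Table1.Name AS Table1_Name, Table2.Name,
           Table2.Value, Table2.Created
    FROM Table2
    INNER JOIN Table1 ON Table1.Id = Table2.Table1_ID
    ORDER BY Table2.Id
""").fetchall()
print(rows)
```

Note that an INNER JOIN drops Table2 rows whose Table1_ID has no match in Table1; a LEFT JOIN would keep them with a NULL name.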
Recommended Readings
http://sql.sh/cours/jointures/inner-join
https://www.w3schools.com/sql/sql_join_inner
https://www.tutorialspoint.com/sql/sql-inner-joins

What is the correct way from performance perspective to match(replace) every value in every row in temp table using SQL Server 2016 or 2017?

I am wondering what should I use in SQL Server 2016 or 2017 (CTE, LOOP, JOINS, CURSOR, REPLACE, etc) to match (replace) every value in every row in temp table? What is the best solution from performance perspective?
Source Table
|id |id2|
| 1 | 2 |
| 2 | 1 |
| 1 | 1 |
| 2 | 2 |
Mapping Table
|id |newid|
| 1 | 3 |
| 2 | 4 |
Expected result
|id |id2|
| 3 | 4 |
| 4 | 3 |
| 3 | 3 |
| 4 | 4 |
You may join the second table to the first table twice:
WITH cte AS (
SELECT
t1.id AS id_old,
t1.id2 AS id2_old,
t2a.newid AS id_new,
t2b.newid AS id2_new
FROM table1 t1
LEFT JOIN table2 t2a
ON t1.id = t2a.id
LEFT JOIN table2 t2b
ON t1.id2 = t2b.id
)
UPDATE cte
SET
id_old = id_new,
id2_old = id2_new;
Not sure if you want just a select here, or maybe an update, or an insert into another table. In any case, the core logic I gave above should work for all these cases.
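The core of that double join can be checked against the question's sample data. The sketch below runs in Python's sqlite3 rather than SQL Server, shows only the SELECT form, and uses the hypothetical table names source and mapping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (id INTEGER, id2 INTEGER);
    CREATE TABLE mapping (id INTEGER PRIMARY KEY, newid INTEGER);
    INSERT INTO source VALUES (1, 2), (2, 1), (1, 1), (2, 2);
    INSERT INTO mapping VALUES (1, 3), (2, 4);
""")

# The mapping table is joined once per column that needs translating.
mapped = conn.execute("""
    SELECT m1.newid AS id, m2.newid AS id2
    FROM source s
    LEFT JOIN mapping m1 ON m1.id = s.id
    LEFT JOIN mapping m2 ON m2.id = s.id2
    ORDER BY s.rowid
""").fetchall()
print(mapped)
```

With LEFT JOIN, a value that has no entry in the mapping table comes back as NULL; switching to INNER JOIN would drop such rows instead.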
You'd need to use a join in the update query. Something like this:
Update tblA set column1 = 'something', column2 = 'something'
from actualName tblA
inner join MappingTable tblB
on tblA.ID = tblB.ID
This query compares each row by ID and, where the IDs match, updates/replaces the column values as you desire. :)
Just join the mapping table twice:
SELECT t1.newid as id, t2.newid as id2
FROM table1 t
INNER JOIN table2 t1 on t1.id = t.id
INNER JOIN table2 t2 on t2.id = t.id2
This may have the best performance of the solutions posted here, if your indexes are set up appropriately:
select (select [newid] from MappingTable where id = [ST].[id]) [id],
(select [newid] from MappingTable where id = [ST].[id2]) [id2]
from SourceTable [ST]

Merging multiple tables sharing a column

I have a list of tables like this:
t1
ID | Name
3 | 'AAA'
4 | 'BBB'
5 | 'CCC'
6 | 'DDD'
7 | 'EEE'
t2
ID | Password
3 | 'test'
6 | 'password'
t3
ID | Birth Year | Last Name
4 | 1990 | 'John'
6 | 1988 | 'Megan'
7 | - | 'Bob'
t4
ID | Birth Year
7 | 1985
I want to merge them all into this, noticing that t3 and t4 both have birth year columns, but the value will only be in either one.
ID | Name | Password | Birth Year | Last Name
3 | 'AAA' | 'test' | - | -
4 | 'BBB' | - | 1990 | 'John'
5 | 'CCC' | - | - | -
6 | 'DDD' |'password'| 1988 | 'Megan'
7 | 'EEE' | - | 1985 | 'Bob'
Does anyone know how this can be done? t1 is the "master" table, so it will always contain all the IDs.
I've tried:
select *
from t1
LEFT outer join t2 on t1.ID = t2.ID
LEFT outer join t3 on t1.ID = t3.ID
LEFT outer join t4 on t1.ID = t4.ID
But it doesn't work properly: the result has separate columns for every individual column of t1, t2, t3 and t4.
select t1.ID,
t1.Name,
t2.Password,
COALESCE(t3.[Birth Year],t4.[Birth Year]), t3.[Last Name]
from t1
LEFT outer join t2 on t1.ID = t2.ID
LEFT outer join t3 on t1.ID = t3.ID
LEFT outer join t4 on t1.ID = t4.ID
You can selectively left join t4 to t1, then show t4.BirthYear if it is not null, otherwise t3.BirthYear. Like this:
SELECT t1.ID, t1.Name, t2.Password, COALESCE(t4.BirthYear, t3.BirthYear) as BirthYear, t3.LastName
FROM t1
LEFT OUTER JOIN t2 on t1.ID = t2.ID
LEFT OUTER JOIN t3 on t1.ID = t3.ID
LEFT OUTER JOIN t4 on t1.ID = t4.ID and t4.BirthYear is NOT NULL
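The COALESCE approach can be checked against the question's data with Python's sqlite3. Two assumptions in this sketch: the "-" cells stand for NULL, and the columns are named BirthYear and LastName without spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ID INTEGER, Name TEXT);
    CREATE TABLE t2 (ID INTEGER, Password TEXT);
    CREATE TABLE t3 (ID INTEGER, BirthYear INTEGER, LastName TEXT);
    CREATE TABLE t4 (ID INTEGER, BirthYear INTEGER);
    INSERT INTO t1 VALUES (3, 'AAA'), (4, 'BBB'), (5, 'CCC'),
                          (6, 'DDD'), (7, 'EEE');
    INSERT INTO t2 VALUES (3, 'test'), (6, 'password');
    -- Assumption: the '-' in the question means NULL.
    INSERT INTO t3 VALUES (4, 1990, 'John'), (6, 1988, 'Megan'),
                          (7, NULL, 'Bob');
    INSERT INTO t4 VALUES (7, 1985);
""")

# COALESCE returns the first non-NULL birth year of the two candidates.
merged = conn.execute("""
    SELECT t1.ID, t1.Name, t2.Password,
           COALESCE(t3.BirthYear, t4.BirthYear) AS BirthYear,
           t3.LastName
    FROM t1
    LEFT JOIN t2 ON t1.ID = t2.ID
    LEFT JOIN t3 ON t1.ID = t3.ID
    LEFT JOIN t4 ON t1.ID = t4.ID
    ORDER BY t1.ID
""").fetchall()
for row in merged:
    print(row)
```

Because the value is only ever present in one of t3 and t4, the order of the COALESCE arguments does not change the result here.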

Finding unmatched records with SQL

I'm trying to write a query to find records which don't have a matching record in another table.
For example, I have two tables whose structures look something like this:
Table1
State | Product | Distributor | other fields
CA | P1 | A | xxxx
OR | P1 | A | xxxx
OR | P1 | B | xxxx
OR | P1 | X | xxxx
WA | P1 | X | xxxx
VA | P2 | A | xxxx
Table2
State | Product | Version | other fields
CA | P1 | 1.0 | xxxx
OR | P1 | 1.5 | xxxx
WA | P1 | 1.0 | xxxx
VA | P2 | 1.2 | xxxx
(State/Product/Distributor together form the key for Table1. State/Product is the key for Table2)
I want to find all the State/Product/Version combinations which are Not using distributor X. (So the result in this example is CA-P1-1.0, and VA-P2-1.2.)
Any suggestions on a query to do this?
SELECT
*
FROM
Table2 T2
WHERE
NOT EXISTS (SELECT *
FROM
Table1 T1
WHERE
T1.State = T2.State AND
T1.Product = T2.Product AND
T1.Distributor = 'X')
This should be ANSI compliant.
In T-SQL:
SELECT DISTINCT Table2.State, Table2.Product, Table2.Version
FROM Table2
LEFT JOIN Table1 ON Table1.State = Table2.State AND Table1.Product = Table2.Product AND Table1.Distributor = 'X'
WHERE Table1.Distributor IS NULL
No subqueries required.
Edit: As the comments indicate, the DISTINCT is not necessary. Thanks!
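The NOT EXISTS and LEFT JOIN ... IS NULL forms can both be checked against the question's sample data. The sketch below uses Python's sqlite3 and omits the "other fields" columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (State TEXT, Product TEXT, Distributor TEXT);
    CREATE TABLE Table2 (State TEXT, Product TEXT, Version TEXT);
    INSERT INTO Table1 VALUES ('CA', 'P1', 'A'), ('OR', 'P1', 'A'),
                              ('OR', 'P1', 'B'), ('OR', 'P1', 'X'),
                              ('WA', 'P1', 'X'), ('VA', 'P2', 'A');
    INSERT INTO Table2 VALUES ('CA', 'P1', '1.0'), ('OR', 'P1', '1.5'),
                              ('WA', 'P1', '1.0'), ('VA', 'P2', '1.2');
""")

# Anti-join written with NOT EXISTS.
not_exists = conn.execute("""
    SELECT T2.State, T2.Product, T2.Version
    FROM Table2 T2
    WHERE NOT EXISTS (SELECT 1 FROM Table1 T1
                      WHERE T1.State = T2.State
                        AND T1.Product = T2.Product
                        AND T1.Distributor = 'X')
    ORDER BY T2.State
""").fetchall()

# The same anti-join written as LEFT JOIN plus an IS NULL filter.
left_join = conn.execute("""
    SELECT T2.State, T2.Product, T2.Version
    FROM Table2 T2
    LEFT JOIN Table1 T1 ON T1.State = T2.State
                       AND T1.Product = T2.Product
                       AND T1.Distributor = 'X'
    WHERE T1.Distributor IS NULL
    ORDER BY T2.State
""").fetchall()

print(not_exists)
print(left_join)
```

Both forms return CA-P1-1.0 and VA-P2-1.2 on this data, matching the result the question asks for.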
select * from table2 where state not in (select state from table1 where distributor = 'X')
Probably not the most clever but that should work.
SELECT DISTINCT t2.State, t2.Product, t2.Version
FROM table2 t2
JOIN table1 t1 ON t1.State = t2.State AND t1.Product = t2.Product
AND t1.Distributor <> 'X'
In Oracle:
SELECT t2.State, t2.Product, t2.Version
FROM Table2 t2, Table1 t1
WHERE t1.State(+) = t2.State
AND t1.Product(+) = t2.Product
AND t1.Distributor(+) = :distributor
AND t1.State IS NULL