How to join 2 tables into one with SQL

I have 2 tables; let's name them tb1 and tb2.
I want to add every row of tb1 that does not exist in tb2 as a new row in tb2.
At the same time, I want to update the existing rows in tb2 with the data from tb1. I have tried to understand join, merge and so on, but I could not work out how to do this in SQL.
For the question, I have built these 2 tables and the result I am trying to achieve.
tb1:
| KEY | col one | col two
+------+------------+-----------
| 1 | data one | data one
| 2 | data two | change data
| 3 | data three | data three
tb2:
| KEY | col one | col two
+------+-----------+-----------
| 1 | data one | data one
| 2 | data two | old data
| 4 | data four | some data
tb2 after SQL:
Key 3 has been added, and col two of key 2 has been changed:
| KEY | col one | col two
+------+------------+-----------
| 1 | data one | data one
| 2 | data two | change data
| 3 | data three | data three
| 4 | data four | some data

You can generate the results you want using union all:
select t1.key, t1.col1, t1.col2
from table1 t1
union all
select t2.key, t2.col1, t2.col2
from table2 t2
where not exists (select 1 from table1 t1 where t1.key = t2.key);
Actually changing table2 is more cumbersome, and depends on the database you are using. One method is an update followed by an insert:
update table2
set col1 = (select t1.col1 from table1 t1 where t1.key = table2.key),
col2 = (select t1.col2 from table1 t1 where t1.key = table2.key)
where col1 <> (select t1.col1 from table1 t1 where t1.key = table2.key) or
col2 <> (select t1.col2 from table1 t1 where t1.key = table2.key);
insert into table2 (key, col1, col2)
select t1.key, t1.col1, t1.col2
from table1 t1
where not exists (select 1 from table2 t2 where t2.key = t1.key);
Specific databases have their own methods for simplifying this logic, including on conflict (PostgreSQL, SQLite), on duplicate key update (MySQL), and merge (SQL Server, Oracle).
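As a minimal sketch of the "on conflict" route, the following runs an upsert against the question's sample data using Python's built-in sqlite3 module. It assumes SQLite 3.24 or newer, names the tables tb1 and tb2 as in the question, and shortens the column names to col1 and col2; the syntax differs in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 ("key" INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    CREATE TABLE tb2 ("key" INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    INSERT INTO tb1 VALUES (1, 'data one', 'data one'),
                           (2, 'data two', 'change data'),
                           (3, 'data three', 'data three');
    INSERT INTO tb2 VALUES (1, 'data one', 'data one'),
                           (2, 'data two', 'old data'),
                           (4, 'data four', 'some data');
""")

# Insert every tb1 row; when the key already exists in tb2, overwrite that row.
# SQLite needs the "WHERE true" so its parser can tell the upsert clause
# apart from a join constraint when the source is a SELECT.
conn.execute("""
    INSERT INTO tb2 ("key", col1, col2)
    SELECT "key", col1, col2 FROM tb1 WHERE true
    ON CONFLICT("key") DO UPDATE
        SET col1 = excluded.col1,
            col2 = excluded.col2
""")

rows = conn.execute('SELECT "key", col1, col2 FROM tb2 ORDER BY "key"').fetchall()
for row in rows:
    print(row)
```

Key 4, which exists only in tb2, is left untouched, because the statement only ever visits rows that come from tb1.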

Related

Find values where related must have list of values

I'm trying to find a simple solution for my SQL Server problem.
I have two tables that look like this:
table1
--id
-- data
table2
--id
--table1_id
--value
I have some records like this:
Table1
+-----------------------+
| id | data |
+-----------------------+
| 1 | ? |
+-----------------------+
| 2 | ? |
+-----------------------+
Table2
+-----------------------+
|id | table1_id | value |
+-----------------------+
| 1 | 1 | 'a' |
+-----------------------+
| 2 | 1 | 'b' |
+-----------------------+
| 3 | 2 | 'a' |
+-----------------------+
Now I want to get the table1 rows, with all their columns, whose related table2 rows include both 'a' AND 'b' as values.
So I would get the id 1 of table1.
Currently I have a query like this:
SELECT t1.[id], t1.[data]
FROM [table1] t1,
(SELECT t1.[id]
FROM [table1] t1
JOIN [table2] t2 ON t1.[id] = t2.[table1_id] AND t2.[Value] IN('a', 'b')
GROUP BY t1.[id]
HAVING COUNT(t2.[Value]) = 2) x
WHERE t1.id = x.id
Does anyone have an idea on how to achieve my goal in a simpler way?
One way uses exists:
select t1.*
from table1 t1
where exists (select 1
from table2 t2
where t2.table1_id = t1.id and t2.value = 'a'
) and
exists (select 1
from table2 t2
where t2.table1_id = t1.id and t2.value = 'b'
);
This can take advantage of an index on table2(table1_id, value).
You could also write:
select t1.*
from table1 t1
where (select count(distinct t2.value)
from table2 t2
where t2.table1_id = t1.id and t2.value in ('a', 'b')
) = 2 ;
This would probably also have very good performance with the index, if table2 doesn't have duplicates.
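To check both formulations against the sample rows from the question, here is a small sketch using Python's sqlite3 module. SQLite stands in for SQL Server here, and the index name ix_table2 is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER, value TEXT);
    INSERT INTO table1 VALUES (1, '?'), (2, '?');
    INSERT INTO table2 VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'a');
    CREATE INDEX ix_table2 ON table2 (table1_id, value);
""")

# Variant 1: one EXISTS per required value.
exists_ids = [r[0] for r in conn.execute("""
    SELECT t1.id
    FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.table1_id = t1.id AND t2.value = 'a')
      AND EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.table1_id = t1.id AND t2.value = 'b')
""")]

# Variant 2: count the distinct required values found for each row.
count_ids = [r[0] for r in conn.execute("""
    SELECT t1.id
    FROM table1 t1
    WHERE (SELECT COUNT(DISTINCT t2.value)
           FROM table2 t2
           WHERE t2.table1_id = t1.id AND t2.value IN ('a', 'b')) = 2
""")]

print(exists_ids, count_ids)
```

Both variants return only id 1, since id 2 has 'a' but no 'b'.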
SELECT T1.[id], T1.[data]
FROM table1 AS T1
JOIN table2 AS T2
ON T1.[id]=T2.[table1_id]
JOIN table2 AS T3
ON T1.[id]=T3.[table1_id]
WHERE
T2.[Value] ='a'
AND T3.[Value] = 'b'
As Gordon Linoff suggested, using an exists clause works as well and could perform better, depending on your data.
You have to take several steps to solve the problem:
Establish which records of table 2 are related to table 1 and have the value 'a' or 'b', eliminating repeated ones with the group by (InfoRelationate).
Keep only the ids that are related to both 'a' and 'b', by means of a count over the result above (ValidateAYB).
Join the validated ids back to InfoRelationate to see which data meets the condition, and combine the result with table 1.
This query meets the conditions:
with InfoRelationate as
(
select Table2.table1_id,value
from Table2 inner join
Table1 on Table2.table1_id=Table1.id and Table2.value IN('a', 'b')
group by Table2.table1_id,value
),
ValidateAYB as
(
select InfoRelationate.table1_id
from InfoRelationate
group by InfoRelationate.table1_id
having count (1)=2
)
select InfoRelationate.table1_id,InfoRelationate.value
from InfoRelationate
inner join ValidateAYB on InfoRelationate.table1_id=ValidateAYB.table1_id
union all
select id,data
from Table1

Take data from two tables and show in one row without duplicates with a where condition

I want to take the data from two tables and output it in one row.
The output will have two columns, "to" and "from": "from" will hold the data from the second table where type is true, and "to" will hold the data from the second table where type is false. FK_ID in the second table is linked to ID in the first table. Please help with the query.
I tried inner joins and union but was not able to make it work. Thanks in advance.
TABLE 1
ID | PATH|
1 | ABC |
2 | EFG |
TABLE 2
ID | FK_ID | NUMBER | TYPE
20 | 1 | 123 | TRUE
21 | 1 | 456 | FALSE
28 | 2 | 888 | FALSE
29 | 2 | 939 | TRUE
OUTPUT SHOULD BE:
ID | PATH | TO | FROM
1 | ABC | 456 | 123
2 | EFG | 888 | 939
Use aggregation with pivoting logic to identify the "to" and "from" components of each path:
SELECT
t1.ID,
t1.PATH,
MAX(CASE WHEN t2.TYPE = 'FALSE' THEN t2.NUMBER END) AS "TO",
MAX(CASE WHEN t2.TYPE = 'TRUE' THEN t2.NUMBER END) AS "FROM"
FROM table1 t1
LEFT JOIN table2 t2
ON t1.ID = t2.FK_ID
GROUP BY
t1.ID,
t1.PATH
ORDER BY
t1.ID;
If performance is an issue, you might find a lateral join to be faster:
SELECT t1.*, t2.*
FROM table1 t1 LEFT JOIN LATERAL
(SELECT SUM(T2.NUMBER) FILTER (WHERE NOT t2.TYPE) as num_to,
SUM(T2.NUMBER) FILTER (WHERE t2.TYPE) as num_from
FROM table2 t2
WHERE t1.ID = t2.FK_ID
) t2 ON true
ORDER BY t1.ID;
This avoids the outer GROUP BY and probably the sorting as well (assuming that ID is the primary key).
It also assumes that TYPE is a Postgres boolean type. If not, use string comparisons for the WHERE clauses.
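The pivoting aggregation from the first query can be checked against the sample data with a short sqlite3 sketch. It assumes TYPE is stored as the strings 'TRUE' and 'FALSE', and uses SQLite rather than Postgres, so the lateral variant is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER PRIMARY KEY, PATH TEXT);
    CREATE TABLE table2 (ID INTEGER PRIMARY KEY, FK_ID INTEGER,
                         NUMBER INTEGER, TYPE TEXT);
    INSERT INTO table1 VALUES (1, 'ABC'), (2, 'EFG');
    INSERT INTO table2 VALUES (20, 1, 123, 'TRUE'),
                              (21, 1, 456, 'FALSE'),
                              (28, 2, 888, 'FALSE'),
                              (29, 2, 939, 'TRUE');
""")

# Each CASE keeps NUMBER only for rows of one TYPE; MAX then collapses the
# group to that single non-NULL value, producing one output row per path.
rows = conn.execute("""
    SELECT t1.ID,
           t1.PATH,
           MAX(CASE WHEN t2.TYPE = 'FALSE' THEN t2.NUMBER END) AS "TO",
           MAX(CASE WHEN t2.TYPE = 'TRUE'  THEN t2.NUMBER END) AS "FROM"
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.ID = t2.FK_ID
    GROUP BY t1.ID, t1.PATH
    ORDER BY t1.ID
""").fetchall()

for row in rows:
    print(row)
```

The two rows printed match the expected output in the question: (1, 'ABC', 456, 123) and (2, 'EFG', 888, 939).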

What is the correct way from performance perspective to match(replace) every value in every row in temp table using SQL Server 2016 or 2017?

I am wondering what I should use in SQL Server 2016 or 2017 (CTE, LOOP, JOINS, CURSOR, REPLACE, etc.) to match (replace) every value in every row in a temp table. What is the best solution from a performance perspective?
Source Table
|id |id2|
| 1 | 2 |
| 2 | 1 |
| 1 | 1 |
| 2 | 2 |
Mapping Table
|id |newid|
| 1 | 3 |
| 2 | 4 |
Expected result
|id |id2|
| 3 | 4 |
| 4 | 3 |
| 3 | 3 |
| 4 | 4 |
You may join the second table to the first table twice:
WITH cte AS (
SELECT
t1.id AS id_old,
t1.id2 AS id2_old,
t2a.newid AS id_new,
t2b.newid AS id2_new
FROM table1 t1
LEFT JOIN table2 t2a
ON t1.id = t2a.id
LEFT JOIN table2 t2b
ON t1.id2 = t2b.id
)
UPDATE cte
SET
id_old = id_new,
id2_old = id2_new;
Not sure if you want just a select here, or maybe an update, or an insert into another table. In any case, the core logic I gave above should work for all these cases.
You'd need to use a join in the update query. Something like this:
Update tblA set column1 = 'something', column2 = 'something'
from actualName tblA
inner join MappingTable tblB
on tblA.ID = tblB.ID
This query compares each row by ID and, where the IDs match, updates/replaces the column values as you desire. :)
Just join the mapping table twice, once per column:
SELECT t1.newid AS id, t2.newid AS id2
FROM table1 t
INNER JOIN table2 t1 ON t1.id = t.id
INNER JOIN table2 t2 ON t2.id = t.id2
This may have the best performance of the solutions posted here if you have appropriate indexes:
select (select [newid] from MappingTable where id = [ST].[id]) [id],
(select [newid] from MappingTable where id = [ST].[id2]) [id2]
from SourceTable [ST]
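As a quick check of the double-join idea on the question's sample data, here is a sketch using Python's sqlite3 module. It says nothing about SQL Server performance; the table names SourceTable and MappingTable follow the question's headings, and ordering by rowid is only there to keep the rows in insertion order for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SourceTable (id INTEGER, id2 INTEGER);
    CREATE TABLE MappingTable (id INTEGER PRIMARY KEY, newid INTEGER);
    INSERT INTO SourceTable VALUES (1, 2), (2, 1), (1, 1), (2, 2);
    INSERT INTO MappingTable VALUES (1, 3), (2, 4);
""")

# Join the mapping table once for each column that needs translating.
rows = conn.execute("""
    SELECT m1.newid AS id, m2.newid AS id2
    FROM SourceTable s
    JOIN MappingTable m1 ON m1.id = s.id
    JOIN MappingTable m2 ON m2.id = s.id2
    ORDER BY s.rowid
""").fetchall()

for row in rows:
    print(row)
```

With inner joins, a source row whose id has no mapping would drop out of the result; a left join would keep it with NULL instead.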

Update multiple rows using select statements

Let's say I have these tables and values:
Table1
------------------------
ID | Value
------------------------
2 | asdf
4 | fdsa
5 | aaaa
Table2
------------------------
ID | Value
------------------------
2 | bbbb
4 | bbbb
5 | bbbb
I want to update all the values in Table2 using the values in Table1 with their respective IDs.
I know I can run this:
UPDATE Table2
SET Value = t1.Value
FROM Table2 t2
INNER JOIN Table1 t1 on t1.ID = t2.ID
But what can I do if Table1 and Table2 are actually select statements with criteria? How can I modify the SQL statement to take that into consideration?
This is how such update queries are generally done in Oracle. Oracle doesn't have an UPDATE FROM option:
UPDATE table2 t2
SET t2.value = ( SELECT t1.value FROM table1 t1
WHERE t1.ID = t2.ID )
WHERE EXISTS ( SELECT 1 FROM table1 t1
WHERE t1.ID = t2.ID );
The WHERE EXISTS clause will make sure that only the rows with a corresponding row in table1 are updated (otherwise every row in table2 will be updated; those without corresponding rows in table1 will be updated to NULL).
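The effect of the WHERE EXISTS guard can be seen in a small sqlite3 sketch. SQLite is used here instead of Oracle, and the extra Table2 row with ID 9 is an assumption added to the question's data so that there is an unmatched row to observe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, Value TEXT);
    CREATE TABLE Table2 (ID INTEGER PRIMARY KEY, Value TEXT);
    INSERT INTO Table1 VALUES (2, 'asdf'), (4, 'fdsa'), (5, 'aaaa');
    INSERT INTO Table2 VALUES (2, 'bbbb'), (4, 'bbbb'), (5, 'bbbb'), (9, 'bbbb');
""")

# Correlated update: each Table2 row looks up its own value in Table1.
# The EXISTS guard skips rows with no partner, so ID 9 keeps 'bbbb'
# instead of being overwritten with NULL.
conn.execute("""
    UPDATE Table2
    SET Value = (SELECT t1.Value FROM Table1 t1 WHERE t1.ID = Table2.ID)
    WHERE EXISTS (SELECT 1 FROM Table1 t1 WHERE t1.ID = Table2.ID)
""")

rows = conn.execute("SELECT ID, Value FROM Table2 ORDER BY ID").fetchall()
for row in rows:
    print(row)
```

If Table1 and Table2 are really select statements with criteria, the same shape should apply: the criteria for Table1 go inside both subqueries, and the criteria for Table2 go into the outer WHERE clause.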

select from multiple tables, when one is never used

I improved my question with example tables for a better understanding
I have 3 tables with following rows:
TABLE1 t1
ID | NAME | OBS
-----------------
1 | Name1 | Obs1
2 | Name2 | Obs2
3 | Name3 | Obs3
4 | Name4 | Obs4
5 | Name5 | Obs5
6 | Name6 | Obs6
7 | Name7 | Obs7
TABLE2 t2
ID | HW_VER
-----------
1 | HWVer1
2 | HWVer2
3 | HWVer3
TABLE3 t3
ID | SERIAL
------------
5 | Serial5
6 | Serial6
7 | Serial7
Now, I want to select the id, name and obs when 2 conditions are fulfilled:
the id is present in t2 or t3 (never in both);
it refers to either t2 or t3 attributes (eg. t2.HW_VER='HWVER1'), never on both
I did something like this, but it's wrong:
SELECT DISTINCT t1.id, t1.name, t1.obs
FROM table1 t1, table2 t2, table3 t3
WHERE t1.id IN (t2.id, t3.id) AND t3.serial='Serial6';
I cannot use unions, external tables or views for this.
Please let me know in case of further questions.
Thanks a lot for your answers, I really appreciate your time.
You need to select from T2 OR T3 but never both? I think you want something like this
select count(*)
from t1
where exists (
select 'x'
from t2
where MyPrimaryKey_Name = 'random_name'
and t2.id = t1.id
)
or exists (
select 'x'
from t3
where MyPrimaryKey_Name = 'random_name'
and t3.id = t1.id
)
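Applied to the sample tables from the question, the two-EXISTS pattern might look like the sketch below. It uses Python's sqlite3 module, takes the table names table1, table2 and table3 from the asker's query, and the filter values 'HWVer1' and 'Serial6' are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, obs TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, hw_ver TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, serial TEXT);
""")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                 [(i, f"Name{i}", f"Obs{i}") for i in range(1, 8)])
conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(i, f"HWVer{i}") for i in (1, 2, 3)])
conn.executemany("INSERT INTO table3 VALUES (?, ?)",
                 [(i, f"Serial{i}") for i in (5, 6, 7)])

# Each EXISTS probes one table independently, so there is no cross join
# between table2 and table3 and no need for DISTINCT or UNION.
rows = conn.execute("""
    SELECT t1.id, t1.name, t1.obs
    FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.id = t1.id AND t2.hw_ver = ?)
       OR EXISTS (SELECT 1 FROM table3 t3
                  WHERE t3.id = t1.id AND t3.serial = ?)
    ORDER BY t1.id
""", ("HWVer1", "Serial6")).fetchall()

for row in rows:
    print(row)
```

To filter on only one of the two tables, drop the other EXISTS branch.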