Merge from 1 table to another - sql

I have the following two tables:
table 1                     table 2
+-------------+------+      +-------------+------+
| ssn         | id   |      | ssn         | id   |
+-------------+------+      +-------------+------+
| 123456789   | 123  |      | 123456789   | k12  |
| 123456789   | 456  |      | 999999999   | k11  |
| 123456789   | 789  |      +-------------+------+
| 123456789   | k12  |
| 999999999   | 799  |
+-------------+------+
What I want to do is to merge the data in table 2 with the data in table 1 if there is no matching id. So 123456789 should be ignored as the member already shows with the id k12. Record 999999999 k11 should be added to table 1.

A few ways to do this. Here's one using NOT EXISTS:
INSERT INTO Table1
SELECT T2.ssn, T2.id
FROM Table2 T2
WHERE NOT EXISTS (
SELECT 1
FROM Table1 T1
WHERE T1.id = T2.id)
Or you could use NOT IN:
INSERT INTO Table1
SELECT ssn, id
FROM Table2
WHERE id NOT IN (SELECT id FROM Table1)
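One caveat with the NOT IN form, shown here as a minimal sketch: if the subquery returns even one NULL, the predicate is never true and nothing gets inserted. The question does not say whether Table1.id is nullable, so the NULL row below is hypothetical, and the column types are assumed. The multi-row VALUES syntax works in SQL Server, MySQL and PostgreSQL but not in Oracle.

```sql
-- Sample data from the question, plus one NULL id (hypothetical) to show the effect.
CREATE TABLE Table1 (ssn VARCHAR(9), id VARCHAR(10));
CREATE TABLE Table2 (ssn VARCHAR(9), id VARCHAR(10));

INSERT INTO Table1 (ssn, id) VALUES
  ('123456789', '123'), ('123456789', '456'), ('123456789', '789'),
  ('123456789', 'k12'), ('999999999', '799'),
  ('000000000', NULL);            -- hypothetical row, not in the question
INSERT INTO Table2 (ssn, id) VALUES
  ('123456789', 'k12'), ('999999999', 'k11');

-- Plain NOT IN: 'k11' NOT IN (..., NULL) evaluates to UNKNOWN, so no rows qualify.
SELECT ssn, id
FROM Table2
WHERE id NOT IN (SELECT id FROM Table1);

-- Guarded NOT IN: NULLs are filtered out, so 999999999 / k11 is inserted.
INSERT INTO Table1 (ssn, id)
SELECT ssn, id
FROM Table2
WHERE id NOT IN (SELECT id FROM Table1 WHERE id IS NOT NULL);
```

The NOT EXISTS version does not have this problem, because a NULL id in Table1 simply never matches.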

INSERT INTO #Table1
( ssn, id)
SELECT t2.ssn, t2.id
FROM #table2 as t2
LEFT JOIN #table1 as t1
ON t1.id = t2.id or t1.ssn = t2.ssn
WHERE t1.id IS NULL;
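If you are worried about duplicates in either id or ssn, this inserts a row only when neither its id nor its ssn already exists in table 1. Note that with the sample data this also skips 999999999 / k11, because that ssn is already present in table 1.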

If this is something you need to do often on large tables, this LEFT JOIN approach may perform a bit faster.
INSERT INTO Table1
SELECT T2.ssn, T2.id
FROM Table2 T2
LEFT JOIN Table1 t1 ON t1.id = t2.id
WHERE t1.id IS NULL

Related

How to find rows that have only one value in another table

I want to find rows from table1 that join to table2 and whose value is the same across all rows for that table2 id.
Example:
The row with id 4 in table2 is not valid because it has different values in table1 (value1, value2).
The row with id 5 in table2 is valid because all of its rows in table1 have the same value (value3).
table1
+----+----------+--------+
| id | table2Id | value  |
+----+----------+--------+
| 1  | 4        | value1 |
| 2  | 4        | value2 |
| 3  | 5        | value3 |
| 4  | 5        | value3 |
| 5  | 5        | value3 |
+----+----------+--------+
table2
+----+
| id |
+----+
| 4 |
| 5 |
+----+
I'm not sure why a join is necessary. You can just use the information in table1:
select t1.*
from (select t1.*,
min(value) over (partition by table2id) as min_value,
max(value) over (partition by table2id) as max_value
from table1 t1
) t1
where min_value = max_value;
Or use not exists:
select t1.*
from table1 t1
where not exists (select 1
from table1 tt1
where tt1.table2id = t1.table2id and
tt1.value <> t1.value
);
In either case, you can join to table2 if you need to for filtering or for other columns, but based on the information in your question the join is not needed.
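To try the idea against the question's data, here is a minimal sketch that loads the sample rows and lists the table2 ids whose rows all share one value. The GROUP BY form is a variation on the min/max comparison, not the answer's exact query, and the column types are assumed.

```sql
-- Sample rows from the question; column types are assumptions.
CREATE TABLE table1 (id INT, table2Id INT, value VARCHAR(10));

INSERT INTO table1 (id, table2Id, value) VALUES
  (1, 4, 'value1'), (2, 4, 'value2'),
  (3, 5, 'value3'), (4, 5, 'value3'), (5, 5, 'value3');

-- One row per table2Id whose values are all identical.
SELECT table2Id
FROM table1
GROUP BY table2Id
HAVING MIN(value) = MAX(value);
-- With the sample rows this returns only table2Id = 5.
```

Comparing MIN and MAX ignores NULLs in value, so a group containing only one distinct non-NULL value plus NULLs would still qualify.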
You can do the following:
SELECT t2.*
FROM (
SELECT table2Id
FROM table1
GROUP BY table2Id
HAVING COUNT(DISTINCT value) = 1) v
JOIN table2 t2 ON t2.id = v.table2Id
Thanks
Try this -
Select t1.* from table1 t1 JOIN table2 t2 on t1.id = t2.id and t1.table2Id = t2.id

Join multiple tables using SQL & T-SQL

Unfortunately, I cannot be sure that the name of my question is correct here.
Example of initial data:
Table 1           Table 2           Table 3
| ID | Name  |    | ID | Info1 |    | ID | Info2 |
|----|-------|    |----|-------|    |----|-------|
| 1  | Name1 |    | 1  | text1 |    | 1  | text1 |
| 2  | Name2 |    | 1  | text1 |    | 1  | text1 |
| 3  | Name3 |    | 2  | text2 |    | 1  | text1 |
                  | 2  | text2 |    | 2  | text2 |
                                    | 3  | text3 |
In my initial data, the three tables are related by the ID field.
I need to join table2 and table3 to the first table, but if I join them sequentially (left join table2, then left join table3, both on ID), the second join produces extra records, because after the first join there are already several records for each ID.
I need the records of table2 and table3 to appear as a list in each column for each ID of the first table.
Here is an example of the expected result:
Table 3
| ID | Name |Info1(Table2)|Info2(Table3)|
|-------|-----------|-------------|-------------|
| 1 | Name1 | text1 | text1 |
| 1 | Name1 | text1 | text1 |
| 1 | Name1 | null | text1 |
| 2 | Name2 | text2 | text2 |
| 2 | Name2 | text2 | null |
| 3 | Name3 | null | text3 |
This is the method I would use; however, the table design you have could probably be improved. Why are Table2 and Table3 separate in the first place?
USE Sandbox;
GO
CREATE TABLE dbo.Table1 (ID int, [Name] varchar(5))
INSERT INTO dbo.Table1 (ID,
[Name])
VALUES(1,'Name1'),
(2,'Name2'),
(3,'Name3');
CREATE TABLE dbo.Table2 (Id int,Info1 varchar(5));
CREATE TABLE dbo.Table3 (Id int,Info2 varchar(5));
INSERT INTO dbo.Table2 (Id,
Info1)
VALUES(1,'text1'),
(1,'text1'),
(2,'text2'),
(2,'text2');
INSERT INTO dbo.Table3 (Id,
Info2)
VALUES(1,'text1'),
(1,'text1'),
(1,'text1'),
(2,'text2'),
(3,'text3');
WITH T2 AS(
SELECT ID,
Info1,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) AS RN --SELECT NULL as you have no other columns to actually create an order
FROM Table2),
T3 AS(
SELECT ID,
Info2,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) AS RN
FROM Table3),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I --Assuming you have 10 or less matching items
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N))
SELECT T1.ID,
T1.[Name],
T2.info1,
T3.info2
FROM Table1 T1
CROSS JOIN Tally T
LEFT JOIN T2 ON T1.ID = T2.ID AND T.I = T2.RN
LEFT JOIN T3 ON T1.ID = T3.ID AND T.I = T3.RN
WHERE T2.ID IS NOT NULL OR T3.ID IS NOT NULL
ORDER BY T1.ID, T.I;
GO
DROP TABLE dbo.Table1;
DROP TABLE dbo.Table2;
DROP TABLE dbo.Table3;
If you have more than 10 rows, you could build a "proper" tally table on the fly or create a physical one. Building one on the fly is probably the better idea, though, as I doubt you're going to have hundreds of matching rows.
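As a sketch of what a larger on-the-fly tally could look like, assuming T-SQL as in the query above: cross joining the ten-row VALUES list with itself yields 100 numbers. The CTE names N and Tally follow the answer; the aliases N1 and N2 are only illustrative.

```sql
WITH N AS (
    -- ten placeholder rows, same list as in the answer above
    SELECT N
    FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS (
    -- 10 x 10 = 100 rows; add another CROSS JOIN to N for 1,000
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
    FROM N N1
         CROSS JOIN N N2)
SELECT I
FROM Tally
ORDER BY I;
```

The Tally CTE here can stand in for the ten-row one in the main query without changing the joins.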

SQL Query to Bring Back where Row Count is Greater than 1

I have two tables. They have the same data but from different sources. I would like to return all columns from both tables where the id in table 2 occurs more than once in table 1. Put another way: if table2.id occurs only once in table1.id, don't bring it back.
I have been thinking that some combination of GROUP BY and ORDER BY clauses could get this done, but I'm not getting the right results. How would you express this in a SQL query?
Table1
| id | info | state | date |
| 1 | 123 | TX | 12-DEC-09 |
| 1 | 123 | NM | 12-DEC-09 |
| 2 | 789 | NY | 14-DEC-09 |
Table2
| id | info | state | date |
| 1 | 789 | TX | 14-DEC-09 |
| 2 | 789 | NY | 14-DEC-09 |
Output
|table2.id| table2.info | table2.state| table2.date|table1.id|table1.info|table1.state|table1.date|
| 1 | 789 | TX | 14-DEC-09 | 1 | 123 | TX | 12-DEC-09 |
| 1 | 789 | TX | 14-DEC-09 | 1 | 123 | NM | 12-DEC-09 |
If you are using MSSQL, try a common table expression:
WITH cte AS (SELECT T1.ID, COUNT(*) as Num FROM Table1 T1
INNER JOIN Table2 T2 ON T1.ID = T2.ID
GROUP BY T1.ID
HAVING COUNT(*) > 1)
SELECT * FROM cte
INNER JOIN Table1 T1 ON cte.ID = T1.ID
INNER JOIN Table2 T2 ON cte.ID = T2.ID
First, I would suggest adding an auto-incrementing column to your tables to make queries like this much easier to write (you still keep your ID as you have it now for relational-mapping). For example:
Table 1:
TableID int
ID int
Info int
State varchar
Date date
Table 2:
TableID int
ID int
Info int
State varchar
Date date
Then your query would be really easy, with no need for GROUP BY, CTEs, or ROW_NUMBER() OVER (PARTITION BY ...):
SELECT *
FROM Table2 T2
JOIN Table1 T1
ON T2.ID = T1.ID
JOIN Table1 T1Duplicate
ON T2.ID = T1Duplicate.ID
AND T1.TableID <> T1Duplicate.TableID
It's a lot easier to read. Furthermore, there are lots of scenarios where an auto-incrementing ID field is beneficial.
I find this a much simpler way to do it:
select TableA.*,TableB.*
from TableA
inner join TableB
on TableA.id=TableB.id
where TableA.id in
(select distinct id
from TableA
group by id
having count(*) > 1)
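Another option, sketched here rather than taken from the answers above, is a window count. It assumes a database with window functions; the alias cnt and the column types are illustrative, and the date column is left out of the sample data for brevity. The multi-row VALUES syntax does not work in Oracle.

```sql
-- Sample rows from the question (date column omitted for brevity).
CREATE TABLE Table1 (id INT, info INT, state VARCHAR(2));
CREATE TABLE Table2 (id INT, info INT, state VARCHAR(2));

INSERT INTO Table1 (id, info, state) VALUES (1, 123, 'TX'), (1, 123, 'NM'), (2, 789, 'NY');
INSERT INTO Table2 (id, info, state) VALUES (1, 789, 'TX'), (2, 789, 'NY');

SELECT t2.id, t2.info, t2.state,
       t1.id AS t1_id, t1.info AS t1_info, t1.state AS t1_state
FROM Table2 t2
JOIN (SELECT t.*,
             COUNT(*) OVER (PARTITION BY t.id) AS cnt  -- rows in Table1 sharing this id
      FROM Table1 t) t1
  ON t1.id = t2.id
WHERE t1.cnt > 1;
-- Returns the two id = 1 rows of Table1, each paired with the id = 1 row of Table2.
```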

Oracle SQL Query with joining four tables to get only one row of multiple entries

My tables are set up something like this:
Table 1              Table 2                              Table 3
+-------+-----+      +-------+-------+-------+-----+      +-------+-----+
| ID    | ... |      | ID    | T1_ID | T3_ID | ... |      | ID    | ... |
+-------+-----+      +-------+-------+-------+-----+      +-------+-----+
| 101   | ... |      | 202   | 101   | 301   | ... |      | 300   | ... |
| 102   | ... |      | 203   | 101   | 302   | ... |      | 302   | ... |
| 104   | ... |      | 204   | 101   | 302   | ... |      | 314   | ... |
+-------+-----+      | 205   | 101   | 302   | ... |      +-------+-----+
                     | 206   | 104   | 327   | ... |
                     +-------+-------+-------+-----+
I want to construct a subquery that selects only one row of table 2 for a given id of table 1, provided table2.t3_id exists in table 3.
The important point is that multiple rows with the same t3_id may exist in table 2. So the foreign key relation between table 2 and table 3 is not unique, or does not exist at all.
My idea was the following statement:
inner join
(
SELECT *
FROM (
SELECT t3_id, t1_id, id
FROM table2
WHERE EXISTS
(
SELECT id
FROM table3
)
)
WHERE ROWNUM=1
) tb2 ON tb1.id = tb2.t1_id
This statement returns multiple rows, but I only need one.
How do I do this?
Not tested but should do what you need
SELECT *
FROM table1 t1 JOIN table2 t2
ON ( t1.id = t2.t1_id
AND EXISTS ( SELECT 'x'
FROM table3 t3
WHERE t2.t3_id = t3.id
)
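AND NOT EXISTS ( SELECT 'a'
FROM table2 t22
WHERE t22.t1_id = t2.t1_id
AND t22.id < t2.id
-- only count earlier rows that also match table3
AND EXISTS ( SELECT 'x'
FROM table3 t33
WHERE t22.t3_id = t33.id )
)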
)
You can get one row of multiple entries by using row_number() to enumerate them and then selecting just one value. Here is an example:
select . . .
from table1 t1 join
(select t2.*, row_number() over (partition by t2.t1_id order by t2.id) as seqnum
from table2 t2
) t2
on t2.t1_id = t1.id and t2.seqnum = 1;
EDIT:
For all three tables, you want to do the row_number() after all the joins:
select . . .
from (select . . ., row_number() over (partition by t2.t1_id order by t2.id) as seqnum
from table1 t1 join
table2 t2
on t2.t1_id = t1.id join
table3 t3
on t2.t3_id = t3.id
) t
where seqnum = 1;
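For a version that can be run as-is, here is a sketch in Oracle syntax that inlines the question's sample rows as CTEs. It swaps the join to table3 for an EXISTS check so that duplicate ids in table3 (which the question does not rule out) cannot multiply rows; that swap and the output column aliases are my own choices, not part of the answers above.

```sql
WITH table1 AS (
    SELECT 101 AS id FROM dual UNION ALL
    SELECT 102 FROM dual UNION ALL
    SELECT 104 FROM dual),
table2 AS (
    SELECT 202 AS id, 101 AS t1_id, 301 AS t3_id FROM dual UNION ALL
    SELECT 203, 101, 302 FROM dual UNION ALL
    SELECT 204, 101, 302 FROM dual UNION ALL
    SELECT 205, 101, 302 FROM dual UNION ALL
    SELECT 206, 104, 327 FROM dual),
table3 AS (
    SELECT 300 AS id FROM dual UNION ALL
    SELECT 302 FROM dual UNION ALL
    SELECT 314 FROM dual)
SELECT t1.id AS t1_id, t2.id AS t2_id, t2.t3_id
FROM table1 t1
JOIN (SELECT t2.*,
             -- number only the table2 rows that have a match in table3
             ROW_NUMBER() OVER (PARTITION BY t2.t1_id ORDER BY t2.id) AS seqnum
      FROM table2 t2
      WHERE EXISTS (SELECT 1 FROM table3 t3 WHERE t3.id = t2.t3_id)) t2
  ON t2.t1_id = t1.id AND t2.seqnum = 1;
-- With the sample rows: one row, t1_id = 101, t2_id = 203, t3_id = 302.
```

Filtering before numbering matters here: row 202 has the lowest id for 101 but no match in table3, so numbering first and filtering afterwards would return nothing for 101.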

MySQL get data from another table with duplicate ID/data

How can I query the rows from table_1 whose ID does not appear in table_2, given that table_2 contains duplicate IDs? See the example below.
I want to get IDs 5 and 6 of Table 1, which are missing from Table 2.
Table 1
-------------
| ID | Name |
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
| 5 | e |
| 6 | f |
-------------
Table 2
--------------
| Table 1 ID |
| 1          |
| 1          |
| 2          |
| 2          |
| 2          |
| 3          |
| 4          |
--------------
Thanks!
A MINUS query would be very helpful here; see this link: minus query replacement
For your data, it would look like this:
SELECT table_1.id FROM table_1 LEFT JOIN table_2 ON table_2.id = table_1.id WHERE table_2.id IS NULL
Use:
SELECT t1.id
FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2 ON t2.id = t1.id
WHERE t2.id IS NULL
Using NOT EXISTS:
SELECT t1.id
FROM TABLE_1 t1
WHERE NOT EXISTS(SELECT NULL
FROM TABLE_2 t2
WHERE t2.id = t1.id)
Using NOT IN:
SELECT t1.id
FROM TABLE_1 t1
WHERE t1.id NOT IN (SELECT t2.id
FROM TABLE_2 t2)
Because there shouldn't be NULL values in table2's id column, the LEFT JOIN/IS NULL is the fastest means: http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/
If I am understanding you correctly you want to do an outer join. In this case it would be:
SELECT * FROM
table_1 LEFT JOIN table_2
ON table_1.id = table_2.id
WHERE table_2.id is NULL
This one does what you want:
Select t1.id
From table1 t1
Left Join table2 t2
On t2.id = t1.id
Where t2.id Is Null
Result:
id
--
5
6