Splitting up data from one table to another - sql

I currently have a table like T1 below that I'd like to split into an individual Person table like T2 and a PersonClientLink table like T3. Each relationship is stored under its own unique ID in T1, so the same person is duplicated once per client, which is what makes this difficult. I need to group the persons from T1 under a single ID each (turning T1 into T2) and create a link row in T3 for each relationship they have. T3 currently exists but is empty; I want to populate T3 from the data in T1 first and then turn T1 into T2. Any advice would be appreciated :)
T1 - Current mess
ID | Client_Id | Name
1 | 5 | Bob
2 | 6 | Bob
3 | 7 | Greg
4 | 8 | Greg
T2 - Person
ID | Name
1 | Bob
2 | Greg
T3 - PersonClientLink
ID | Person_Id | Client_Id
1 | 1 | 5
2 | 1 | 6
3 | 2 | 7
4 | 2 | 8
I'm honestly clueless and have no idea where to begin with this...

I would create the T2 and T3 tables, populate them from T1, and drop T1 afterwards. The following was done with PostgreSQL, but it should work much the same in SQL Server. A serial column in PostgreSQL plays the same role as an identity column in SQL Server.
create table t2 (id serial primary key,
name varchar(32) not null);
create table t3 (id serial primary key,
person_id integer not null,
client_id integer not null);
Create a foreign key from t3.person_id to t2.id
Populate t2 and t3:
insert into t2 (name) (select distinct name from t1);
insert into t3 (person_id,client_id) (
select t2.id,t1.client_id from t1,t2 where t2.name = t1.name);
Convert the comma join to an ANSI-style JOIN ... ON if SQL Server requires that.
The initial data (close to yours; here the names are lowercase and row 4 has client_id 6) and the results:
select * from t1;
id | client_id | name
----+-----------+------
1 | 5 | bob
2 | 6 | bob
3 | 7 | greg
4 | 6 | greg
(4 rows)
select * from t2;
id | name
----+------
1 | bob
2 | greg
(2 rows)
select * from t3;
id | person_id | client_id
----+-----------+-----------
1 | 1 | 6
2 | 1 | 5
3 | 2 | 6
4 | 2 | 7
(4 rows)
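As a rough, runnable sketch of the same steps with the join written in ANSI style, here is the question's T1 data pushed through SQLite using Python's built-in sqlite3 module. SQLite and its AUTOINCREMENT columns are stand-ins chosen for this illustration (in place of serial/identity), and the UNIQUE constraint on name is an added assumption that a name identifies exactly one person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The question's T1 data.
    CREATE TABLE t1 (id INTEGER PRIMARY KEY,
                     client_id INTEGER NOT NULL,
                     name TEXT NOT NULL);
    INSERT INTO t1 (id, client_id, name) VALUES
        (1, 5, 'Bob'), (2, 6, 'Bob'), (3, 7, 'Greg'), (4, 8, 'Greg');

    -- Assumption: a name identifies exactly one person, hence UNIQUE.
    CREATE TABLE t2 (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     name TEXT NOT NULL UNIQUE);
    CREATE TABLE t3 (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     person_id INTEGER NOT NULL REFERENCES t2 (id),
                     client_id INTEGER NOT NULL);

    -- One person row per distinct name.
    INSERT INTO t2 (name) SELECT DISTINCT name FROM t1;

    -- One link row per original T1 row, using an ANSI-style join.
    INSERT INTO t3 (person_id, client_id)
    SELECT t2.id, t1.client_id
    FROM t1
    INNER JOIN t2 ON t2.name = t1.name;
""")

names = [row[0] for row in conn.execute("SELECT name FROM t2 ORDER BY name")]
links = conn.execute("""
    SELECT t2.name, t3.client_id
    FROM t3
    INNER JOIN t2 ON t2.id = t3.person_id
    ORDER BY t3.client_id
""").fetchall()
print(names)
print(links)
```

The check at the end resolves each link back to a name instead of relying on specific generated ids, since the order in which ids are handed out is not something to depend on.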
I hope this helps.

Related

Update records in SQL by looking up in different table

I am copying data from a few tables in SQL Server A to SQL Server B. I have a set of staging tables in B and need to update some of those staging tables based on updated values in the final target table in B.
Example:
Server B:
StagingTable1:
ID | NAME | CITY
1 | ABC | XYZ
2 | BCD | XXX
StagingTable2:
ID | AGE | Table1ID(FK)
10 | 15 | 1
20 | 16 | 2
After copying StagingTable1 to TargetTable1, the IDs are auto-populated and the rows get new IDs: ID 1 becomes 2 and ID 2 becomes 3.
TargetTable1:
ID | NAME | CITY
1 | PQR | YYY (pre-existing record)
2 | ABC | XYZ
3 | BCD | XXX
So before I can copy StagingTable2, I need to update its Table1ID column with the correct values from TargetTable1.
StagingTable2 should become:
ID | AGE | Table1ID(FK)
10 | 15 | 2
20 | 16 | 3
I am writing a stored procedure for this and am not sure how to look up and update the records in the staging tables.
Assuming that (name, city) tuples are unique in StagingTable1 and TargetTable1, you can use an updatable common table expression to generate the new mapping and assign the corresponding values:
with cte as (
select st2.Table1ID, tt1.id
from StagingTable2 st2
inner join StagingTable1 st1 on st1.ID = st2.Table1ID
inner join TargetTable1 tt1 on tt1.name = st1.name and tt1.city = st1.city
)
update cte set Table1ID = id
Demo on DB Fiddle - content of StagingTable2 after the update:
id | age | Table1ID
-: | --: | -------:
10 | 15 | 2
20 | 16 | 3
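For engines without updatable CTEs, the same remapping can be sketched with a correlated subquery instead. The example below is an illustration only: it swaps in SQLite (through Python's sqlite3 module) and a plain UPDATE with a subquery, not the CTE form above, and it keeps the same assumption that (name, city) is unique in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StagingTable1 (ID INTEGER PRIMARY KEY, NAME TEXT, CITY TEXT);
    CREATE TABLE StagingTable2 (ID INTEGER PRIMARY KEY, AGE INTEGER, Table1ID INTEGER);
    CREATE TABLE TargetTable1  (ID INTEGER PRIMARY KEY, NAME TEXT, CITY TEXT);

    INSERT INTO StagingTable1 VALUES (1, 'ABC', 'XYZ'), (2, 'BCD', 'XXX');
    INSERT INTO StagingTable2 VALUES (10, 15, 1), (20, 16, 2);
    INSERT INTO TargetTable1  VALUES (1, 'PQR', 'YYY'), (2, 'ABC', 'XYZ'), (3, 'BCD', 'XXX');

    -- Look up the old staging row, then the target row with the same
    -- (name, city), and store the target's ID as the new foreign key.
    UPDATE StagingTable2
    SET Table1ID = (
        SELECT tt1.ID
        FROM StagingTable1 st1
        INNER JOIN TargetTable1 tt1
            ON tt1.NAME = st1.NAME AND tt1.CITY = st1.CITY
        WHERE st1.ID = StagingTable2.Table1ID
    )
    -- Only touch rows that actually have a match, so nothing becomes NULL.
    WHERE EXISTS (
        SELECT 1
        FROM StagingTable1 st1
        INNER JOIN TargetTable1 tt1
            ON tt1.NAME = st1.NAME AND tt1.CITY = st1.CITY
        WHERE st1.ID = StagingTable2.Table1ID
    );
""")

rows = conn.execute(
    "SELECT ID, AGE, Table1ID FROM StagingTable2 ORDER BY ID"
).fetchall()
print(rows)
```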

Insert records from 1st table to 2nd table only when the record is not present in the 2nd table

I have one table with the same structure as a second table, and I have to insert records from table1 into table2 with
insert into table2(select * from table1);
Table 2 has a primary key on one of the fields, say (id), and someone has already inserted data for some of those primary key values.
table1
id | name
1 | new1
2 | new2
3 | new3
5 | new5
table2
id | name
1 | old1
4 | new4
3 | old3
6 | old6
I have to insert only those records whose primary key is not already present in table 2.
After insertion table 2 should look like this
table 2
id | name
1 | old1
2 | new2
3 | old3
4 | new4
5 | new5
6 | old6
What is the easiest way to do this?
Use a NOT EXISTS condition to get only those rows from table1 that don't exist in table2:
insert into table2 (id, name)
select t1.id, t1.name
from table1 t1
where not exists (select *
from table2 t2
where t2.id = t1.id);
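To see the NOT EXISTS insert against the question's sample rows, here is a small sketch using SQLite through Python's sqlite3 module (the choice of SQLite is an assumption made only so the example is runnable).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT);

    INSERT INTO table1 VALUES (1, 'new1'), (2, 'new2'), (3, 'new3'), (5, 'new5');
    INSERT INTO table2 VALUES (1, 'old1'), (4, 'new4'), (3, 'old3'), (6, 'old6');

    -- Copy only the rows whose id is not yet present in table2.
    INSERT INTO table2 (id, name)
    SELECT t1.id, t1.name
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id);
""")

result = conn.execute("SELECT id, name FROM table2 ORDER BY id").fetchall()
print(result)
```

Rows 1 and 3 keep their old names because the existing rows are never touched; only ids 2 and 5 are added.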
You could also use a MERGE statement with only a WHEN NOT MATCHED ... INSERT branch:
MERGE INTO table2 t2
USING table1 t1
ON (t1.id = t2.id)
WHEN NOT MATCHED THEN
INSERT (id, name)
VALUES (t1.id, t1.name);

SQL Join On Columns of Different Length

I'm trying to join two tables together in SQL where the columns contain a different number of unique entries.
When I use a full join the additional entries in the column joined on are missing.
The code I'm using is (in a SAS proc SQL):
proc sql;
create table table3 as
select table1.*, table2.*
from table1 full join table2
on table1.id = table2.id;
quit;
Visual example of the problem (I can't show the actual tables as they contain sensitive data):
Table 1
id | count1
1 | 2
2 | 3
3 | 2
Table 2
id | count2
1 | 4
2 | 5
3 | 6
4 | 2
Table 3
id | count1 | count2
1 | 2 | 4
2 | 3 | 5
3 | 2 | 6
- | - | 2 <----- I don't want the id column to be blank in this row
I hope I've explained my problem clearly enough, thanks in advance for your help.
The id from table 1 is blank because the row from table2 has no match in table 1. Try looking at the output from this query:
select coalesce(table1.id, table2.id) as id, table1.count1, table2.count2
from table1 full join table2
on table1.id = table2.id;
COALESCE works from left to right, returning the first non-null value (it can take more than two arguments). If the id in table 1 is null, it uses the id from table 2 instead.
I also recommend aliasing all tables in queries, so I'd have written this:
SELECT
COALESCE(t1.id, t2.id) as id,
t1.count1,
t2.count2
FROM
table1 t1
FULL OUTER JOIN
table2 t2
ON
t1.id = t2.id;
Simply selecting coalesce(t1.id, t2.id) will return the first non-null id value.
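The effect of COALESCE on the unmatched row can be checked with a small sketch in SQLite via Python's sqlite3 module. One loud caveat: FULL JOIN is only available in newer SQLite builds, so this illustration uses a LEFT JOIN starting from table2 instead. That is a swapped-in technique, and it matches the full join here only because, in the sample data, every id in table1 also appears in table2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, count1 INTEGER);
    CREATE TABLE table2 (id INTEGER, count2 INTEGER);

    INSERT INTO table1 VALUES (1, 2), (2, 3), (3, 2);
    INSERT INTO table2 VALUES (1, 4), (2, 5), (3, 6), (4, 2);
""")

rows = conn.execute("""
    SELECT t1.id                   AS raw_id,  -- NULL when table1 has no match
           COALESCE(t1.id, t2.id)  AS id,      -- falls back to table2's id
           t1.count1,
           t2.count2
    FROM table2 t2
    LEFT JOIN table1 t1 ON t1.id = t2.id
    ORDER BY t2.id
""").fetchall()

for row in rows:
    print(row)
```

The last row comes back with raw_id as None but id as 4, which is exactly the blank the question wanted filled.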

Find missing records in many-to-many junction table

I have a many-to-many junction table that joins 2 tables:
table1
table1_id
-----------
1
2
3
table2
table2_id
------------
A
B
C
join_table
table1_id | table2_id
------------|--------------
1 | A
1 | B
1 | C
2 | A
2 | C
3 | B
How can I write a single query to find out what records are missing from the join table, since I want all combinations accounted for. Basically, I want a query that returns these missing records:
table1_id | table2_id
---------------------------
2 | B
3 | A
3 | C
I feel like this should be simple but I haven't figured it out.
Do a CROSS JOIN to get all possible combinations. Then use EXCEPT to remove the existing combinations.
select t1.table1_id, t2.table2_id
from table1 t1 cross join table2 t2
except
select table1_id, table2_id from join_table
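A quick way to confirm this against the question's data is the sketch below, which runs the CROSS JOIN / EXCEPT query in SQLite through Python's sqlite3 module (SQLite is an assumption for the sake of a runnable example; the ORDER BY is added only to make the output deterministic).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (table1_id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (table2_id TEXT PRIMARY KEY);
    CREATE TABLE join_table (table1_id INTEGER, table2_id TEXT);

    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES ('A'), ('B'), ('C');
    INSERT INTO join_table VALUES
        (1, 'A'), (1, 'B'), (1, 'C'), (2, 'A'), (2, 'C'), (3, 'B');
""")

missing = conn.execute("""
    -- Every possible pair ...
    SELECT t1.table1_id, t2.table2_id
    FROM table1 t1
    CROSS JOIN table2 t2
    EXCEPT
    -- ... minus the pairs that already exist.
    SELECT table1_id, table2_id FROM join_table
    ORDER BY 1, 2
""").fetchall()
print(missing)
```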

Delete subtable from a table, SQL

What is the most clever way to delete tuples from table1 that are also in a second table, if the second table is not part of the initial database but the result of some really big query?
table1
id1 | id2
1 | 2
3 | 4
5 | 6
7 | 8
9 | 10
11 | 12
13 | 14
second table (this table is the result of some query)
id1 | id2
5 | 6
1 | 2
11 | 12
I came up with
delete from table1
where id1 in (select id1 from ( really long query to get a second table))
and id2 in (select id2 from (the same really long query to get a second table));
It works, but I feel like I'm doing it way too wrong, and not keeping the query DRY.
And would the way you suggest work the same if table1 had an additional column, for example "somecol"?
IMO, you can use an EXISTS condition like this:
DELETE FROM table1
WHERE EXISTS (
SELECT 1
FROM (<your long query>) AS dt
WHERE table1.id1 = dt.id1
AND table1.id2 = dt.id2);
One method is to use with, delete and exists:
with secondtable as (
<your query here>
)
delete from table1
where exists (select 1
from secondtable st
where table1.id1 = st.id1 and table1.id2 = st.id2
);
A Correlated Subquery using EXISTS allows matching multiple columns:
delete
from table1
where exists
( select * from
(
"really long query"
) as t2
where table1.id1 = t2.id1 -- correlating inner and outer table
and table1.id2 = t2.id2 -- similar to a join-condition
)
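The EXISTS-based delete can be tried on the question's data with the sketch below, using SQLite through Python's sqlite3 module. Several things here are assumptions for illustration: SQLite itself, the VALUES list standing in for the "really long query", and the contents of the somecol column (added to show that an extra column on table1 does not change the approach).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- somecol and its values are hypothetical, added for illustration.
    CREATE TABLE table1 (id1 INTEGER, id2 INTEGER, somecol TEXT);
    INSERT INTO table1 VALUES
        (1, 2, 'a'), (3, 4, 'b'), (5, 6, 'c'), (7, 8, 'd'),
        (9, 10, 'e'), (11, 12, 'f'), (13, 14, 'g');
""")

conn.execute("""
    -- The VALUES list is a stand-in for the really long query,
    -- which is written only once.
    WITH secondtable (id1, id2) AS (
        VALUES (5, 6), (1, 2), (11, 12)
    )
    DELETE FROM table1
    WHERE EXISTS (
        SELECT 1
        FROM secondtable st
        WHERE table1.id1 = st.id1   -- both columns must match
          AND table1.id2 = st.id2   -- on the same row
    )
""")

remaining = conn.execute(
    "SELECT id1, id2, somecol FROM table1 ORDER BY id1"
).fetchall()
print(remaining)
```

Unlike two separate IN conditions, which would also match an id1 and an id2 taken from different rows of the second table, the correlated EXISTS requires both columns to match on the same row.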