Assign lowest record ID with Outer Apply - sql

Assume I have these tables:
CREATE TABLE Employee (ID int, EmployeeIdentifier varchar(100),ManagerIdentifier varchar(100))
CREATE TABLE EmployeeManager (ID int, EmployeeID varchar(100))
INSERT Employee
VALUES
(1,'apple','apple'),
(2,'banana','apple'),
(3,'citrus','apple'),
(4,'grape','grape'),
(5,'grape','grape'),
(6,'grape','grape')
INSERT EmployeeManager
VALUES
(1,1),
(2,1),
(3,1),
(4,4),
(5,5),
(6,5)
For Employee.ID IN (1,2,3), records in EmployeeManager look fine.
But for Employee.ID IN (4,5,6) we can see many duplicates. We are not allowed to delete any records from the Employee table, but we are free to assign the EmployeeManager.EmployeeID value. Since there is only one actual record for grape and the rest are duplicates, I want to set EmployeeManager.EmployeeID to the minimum Employee.ID among all duplicated grape records in the Employee table, i.e. to 4.
I have this query,
UPDATE d SET EmployeeID = l.ID
FROM dbo.EmployeeManager d
INNER JOIN Employee s on d.ID=s.ID
OUTER APPLY (
SELECT ID
FROM Employee l
WHERE s.ManagerIdentifier=l.EmployeeIdentifier
) l
WHERE
EXISTS (
SELECT d.EmployeeID
EXCEPT
SELECT l.ID
)
If you keep running it, you will see that the EmployeeManager.EmployeeID values for ID (4,5,6) keep changing.
How can I change the above update statement so that it assigns the lowest Employee.ID to all EmployeeManager.ID (4,5,6), i.e. 4?
We are not allowed to run a one-time fix script, because corrupted data can keep coming into the above tables.
Desired output after running above update statement should be

You need TOP (1) and ORDER BY in the subquery to pick out a specific row
UPDATE d SET EmployeeID = l.ID
FROM dbo.EmployeeManager d
INNER JOIN Employee s on d.ID=s.ID
OUTER APPLY (
SELECT TOP (1) ID
FROM Employee l
WHERE s.ManagerIdentifier = l.EmployeeIdentifier
ORDER BY ID
) l
WHERE
EXISTS (
SELECT d.EmployeeID
EXCEPT
SELECT l.ID
)
You appear to have a normalization issue, as the manager is defined in two places.
I suggest you use better aliases for your tables; these are not very memorable.
You can change your OUTER APPLY to CROSS APPLY, and then (assuming d.EmployeeID is never NULL, since <> does not treat NULL as a difference) use a standard <> instead of the EXISTS/EXCEPT:
CROSS APPLY (
SELECT TOP (1) ID
FROM Employee l
WHERE s.ManagerIdentifier = l.EmployeeIdentifier
ORDER BY ID
) l
WHERE d.EmployeeID <> l.ID
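Before running either UPDATE, it can help to preview which rows would change. The query below is only a sketch along those lines: it reuses the tables from the question, swaps in MIN for TOP (1) ... ORDER BY purely as an illustration, and the alias e and the output column names are introduced here. With the sample data from the question it should list the EmployeeManager rows with ID 5 and 6.

```sql
-- Preview only: nothing is modified.
SELECT d.ID,
       d.EmployeeID AS CurrentEmployeeID,
       l.ID         AS LowestEmployeeID
FROM dbo.EmployeeManager d
INNER JOIN Employee s ON d.ID = s.ID
CROSS APPLY (
    -- lowest Employee.ID that carries the manager's identifier
    SELECT MIN(e.ID) AS ID
    FROM Employee e
    WHERE e.EmployeeIdentifier = s.ManagerIdentifier
) l
WHERE d.EmployeeID <> l.ID;  -- rows where no match exists yield NULL and are filtered out here
```

Because the aggregate always returns exactly one row, rows with no matching manager come back with a NULL and are dropped by the final comparison.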

Related

How to insert data in multiple rows of temp tables in sql

How can I insert into the same row? For example, I want to insert all of these columns' data into the first row, then the second, and so on. But my query inserts the status data only after the last customer number row, once the customer number data is complete.
CREATE TABLE #tblCustomer
(
CustomerNumber NVARCHAR(1000),
Status NVARCHAR (1000),
CustomerType NVARCHAR (1000)
)
INSERT
INTO #tblCustomer (CustomerNumber)
Select c.CustomerNumber
From Customer.Customer c
INSERT
INTO #tblCustomer (Status)
Select ses.Name
From Customer.Customer c
Left Outer Join COM.StatusEngine_EntityStatus sees
On c.Status = sees.EntityStatusId
And sees.EntityId = 'CustomerStatus'
Join COM.StatusEngine_Status ses
On sees.Status = ses.Status
INSERT
INTO #tblCustomer (CustomerType)
select t.Description
From Customer.Customer c
Join Customer.Type t
On c.TypeId = t.pkTypeId
Receiving output:
0001 null null
0002 null null
NULL active null
NULL active null
NULL null individual
NULL null individual
Expected Output:
0001 active individual
0002 active individual
Without knowing more about your tables, you can insert the first records like so...
INSERT INTO #tblCustomer (CustomerNumber)
select c.CustomerNumber from Customer.Customer c
And then update the remaining columns this way...
UPDATE temp
set temp.Status = ses.Name -- the status name, as selected in the original INSERT
from Customer.Customer c
left outer join COM.StatusEngine_EntityStatus sees
on c.Status = sees.EntityStatusId and sees.EntityId = 'CustomerStatus'
join COM.StatusEngine_Status ses
on sees.Status = ses.Status
join #tblCustomer temp
on c.CustomerNumber = temp.CustomerNumber
However, doing it like this is really inefficient; you should strive to write a single insert that fills all columns in one go.
You can do it like this (I have verified the code with the Northwind sample database from Microsoft - I have chosen that one since you can use it for each SQL server version since SQL 2000):
declare @NumberOfItems int = 10;
CREATE TABLE #tblCustomer (
CustomerNumber NVARCHAR(1000)
,Name NVARCHAR (1000)
,CustomerType NVARCHAR (1000))
insert into #tblCustomer
select CustomerNumber, Name, Status from (select top(@NumberOfItems) ROW_NUMBER() OVER(ORDER BY CustomerID) as No, CustomerID as CustomerNumber from Customers) c
left join (select * from (select top(@NumberOfItems) ROW_NUMBER() OVER(ORDER BY ContactName) as No, ContactName as Name from Customers) q2) j1 on c.No=j1.No
left join (select * from (select top(@NumberOfItems) ROW_NUMBER() OVER(ORDER BY ContactTitle) as No, ContactTitle as Status from Customers) q3) j2 on c.No=j2.No
select * from #tblCustomer
drop table #tblCustomer
It will create a column with numbers from 1 to n for each element you want to import and then it joins it together.
The result of this query is:
Note: While this works, it is not the preferred way to do it, because there is no primary key; normally one would look for primary key / foreign key relationships to join the data together. The way you intend to fill the table puts together data that does not necessarily belong together (here each column is sorted separately and the values are then matched up by row number). If you have no primary key because you're importing data from other sources, you can add WHERE clauses to create a better connection between the inner and the outer select statements - you can find a nice article which might help you with such kind of subqueries here.
This is untested, however, I believe this is what you're after:
INSERT INTO #tblCustomer (CustomerNumber, [Status], CustomerType)
SELECT c.CustomerNumber, ses.[Name], t.[Description]
FROM Customer.Customer c
JOIN COM.StatusEngine_EntityStatus sees ON c.Status = sees.EntityStatusId --Changed to JOIN, as it is turned into an implicit INNER JOIN by the next JOIN
AND sees.EntityId = 'CustomerStatus'
JOIN COM.StatusEngine_Status ses ON sees.[Status] = ses.[Status]
JOIN Customer.Type t ON c.TypeId = t.pkTypeId; --Added: t was referenced in the SELECT but never joined
Note my comment regarding your LEFT OUTER JOIN, in that I've changed it to an INNER JOIN.
Straightforward SQL here:
CREATE TABLE #tblCustomer
(
CustomerNumber NVARCHAR(1000),
Status NVARCHAR (1000),
CustomerType NVARCHAR (1000)
)
INSERT INTO #tblCustomer (CustomerNumber, Status, CustomerType)
SELECT DISTINCT
c.CustomerNumber,
ses.Name,
t.Description
FROM Customer.Customer c
LEFT OUTER JOIN COM.StatusEngine_EntityStatus sees
On c.Status = sees.EntityStatusId
And sees.EntityId = 'CustomerStatus'
LEFT OUTER JOIN COM.StatusEngine_Status ses
On sees.Status = ses.Status
LEFT OUTER JOIN Customer.Type t
On c.TypeId = t.pkTypeId

Automatic SQL Calculate

I have a test table:
PiggyBank_Current
- Name (Primary Key)
-- Jackson
- Value
-- 0
PiggyBank_Default
- Name
-- Jackson
- value
-- 100
PiggyBank_Earn
- Name
-- Jackson
- Value
-- 20
Is it possible that every time I add a new Jackson earning record to PiggyBank_Earn, all of Jackson's earnings are automatically summed and added to the default value? The total should then replace the Value in PiggyBank_Current for the row whose Name is Jackson. So for this example the total would be 120.
You can use a trigger for that purpose. I used id instead of [name] to join the tables:
CREATE TRIGGER PiggyBank_Earn_Trigger
ON PiggyBank_Earn
AFTER INSERT, UPDATE
AS
;WITH cte AS (
SELECT id,
SUM([val]) as [value]
FROM (
-- default amount for each id touched by this statement
SELECT d.id,
SUM(d.[value]) as [val]
FROM PiggyBank_Default d
WHERE d.id IN (SELECT i.id FROM inserted i)
GROUP BY d.id
UNION ALL
-- all earnings recorded so far for the same ids (the new rows are already in the table)
SELECT e.id,
SUM(e.[value]) as [val]
FROM PiggyBank_Earn e
WHERE e.id IN (SELECT i.id FROM inserted i)
GROUP BY e.id
) as t
GROUP BY id
)
MERGE PiggyBank_Current as target
USING cte as source
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET [value] = source.[value]
WHEN NOT MATCHED THEN
INSERT VALUES (source.id, source.[value]);
The CTE collects every id from inserted and sums the values from both tables; the MERGE then updates or inserts the matching row in PiggyBank_Current.

How to copy records from inter-linked tables to another in a different database?

I have 3 tables that are inter-linked between each other. The design of the tables are as below.
First (PK:FirstID, vchar:Name, int:Year)
Second (PK:SecondID, FK:FirstID, int:Day, int:Month)
Third (PK:ThirdID, FK:SecondID, int:Speed, vchar:Remark)
I'm trying to copy records from 3 inter-linked tables from Database A to Database B. So my Transact-SQL looks something like this:
INSERT INTO First
(Name, Year)
SELECT Name, Year
FROM DB_A.dbo.First
WHERE Year >= 1992
INSERT INTO Second
(FirstID, Day, Month)
SELECT S.FirstID, Day, Month
FROM DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID
WHERE Month > 6
INSERT INTO Third
(SecondID, Speed, Remark)
SELECT T.SecondID, Speed, Remark
FROM DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID
WHERE Remark IS NOT NULL
These statements work fine until the starting value of First.FirstID differs between Database A and Database B, because the three tables in Database B start out empty. At that point a foreign key constraint error is raised.
Possible Solutions
1) Reuse old First.FirstID. One of the solutions I have figured out is to reuse the old First.FirstID from Database A. This can be done by running SET IDENTITY_INSERT TableName ON just before the insert into TableName and including TableName.TableNameID in the insert statement. However, my colleagues have advised me against doing this.
2) Overwrite Second.FirstID with the new First.FirstID and, subsequently, Third.SecondID with the new Second.SecondID. I'm trying to apply this solution using OUTPUT and a TABLE variable, by outputting all First.FirstID values into a temporary table variable and associating them with table Second, similar to this answer. However, I'm stuck on how to associate and replace the Second.FirstID values with the correct IDs in the temporary table. An answer on how to do this would also be accepted as the answer to this question.
3) Use solution No. 1 and update the primary and foreign keys using UPDATE CASCADE. I just got this idea, but I have a feeling it will be very tedious. More research needs to be done, but if there's an answer that shows how to implement this successfully, then I'll accept that answer.
So how do I copy records from 3 inter-linked tables to another 3 similar tables but different primary keys? Are there any better solutions than the ones proposed above?
You can use the OUTPUT clause.
CREATE TABLE #First (NewId INT PRIMARY KEY, OldId INT)
INSERT INTO First
(
Name,
Year,
OldId -- Added new column
)
OUTPUT Inserted.FirstID, Inserted.OldId INTO #First
SELECT
Name,
Year,
FirstID -- Old Id to OldId Column
FROM
DB_A.dbo.First
WHERE
Year >= 1992
Second Table
CREATE TABLE #Second (NewId INT PRIMARY KEY, OldId INT)
INSERT INTO Second
(
FirstID,
Day,
Month,
OldId -- Added new column
)
OUTPUT Inserted.SecondID, Inserted.OldId INTO #Second
SELECT
NF.NewId, --FirstID
Day,
Month,
SecondID
FROM
DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID INNER JOIN
#First NF ON F.FirstId = NF.OldId -- Old ids here (alias renamed: OF is a reserved keyword)
WHERE
Month > 6
Last one
INSERT INTO Third
(
SecondID,
Speed,
Remark
)
SELECT
OS.NewId, -- SecondID
Speed,
Remark
FROM
DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID INNER JOIN
#Second OS ON S.SecondID = OS.OldId
WHERE Remark IS NOT NULL
First Solution
Using MERGE and OUTPUT together
OUTPUT combined with the MERGE statement can return the old primary keys alongside the new ones as rows are inserted into the table.
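A minimal sketch of that idea, assuming the table layout from the question; the table variable @FirstMap and its column names are hypothetical and introduced only for this illustration. MERGE is used because its OUTPUT clause may reference columns of the source, which a plain INSERT ... OUTPUT cannot do.

```sql
-- Hypothetical mapping table: old key from DB_A, new key generated in the target database.
DECLARE @FirstMap TABLE (OldID INT PRIMARY KEY, NewID INT NOT NULL);

MERGE INTO First AS target
USING (
    SELECT FirstID, Name, Year
    FROM DB_A.dbo.First
    WHERE Year >= 1992
) AS source
ON 1 = 0  -- never true, so every source row takes the NOT MATCHED branch
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Name, Year)
    VALUES (source.Name, source.Year)
OUTPUT source.FirstID, inserted.FirstID
INTO @FirstMap (OldID, NewID);

-- The child table can then be copied by translating the old key through the map.
INSERT INTO Second (FirstID, Day, Month)
SELECT m.NewID, s.Day, s.Month
FROM DB_A.dbo.Second AS s
INNER JOIN @FirstMap AS m ON m.OldID = s.FirstID
WHERE s.Month > 6;
```

The same pattern could be repeated for Second and Third with a second mapping table. Unlike the OUTPUT example with an extra OldId column, this variant would not require changing the target table's schema.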
Second Solution
NOTE: This solution only works if you are sure that you have another column that has its values unique in the table besides the table's primary key.
You may use this column as a link between the table in the source database and its sister table in the target database. The code below is an example taking into account that First.Name has unique values when month > 6.
-- no changes to insert code in First table
INSERT INTO First
(Name, Year)
SELECT Name, Year
FROM DB_A.dbo.First
WHERE Year >= 1992
INSERT INTO Second
(FirstID, Day, Month)
SELECT CurrentF.FirstID, Day, Month -- 2. Use the FirstID that has been input in First table
FROM DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID INNER JOIN
First CurrentF ON CurrentF.Name = F.Name -- 1. Join Name as a link
WHERE Month > 6
INSERT INTO Third
(SecondID, Speed, Remark)
SELECT CurrentS.SecondID, Speed, Remark --5. Get the proper SecondID
FROM DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID INNER JOIN
First CurrentF ON CurrentF.Name = F.Name INNER JOIN -- 3. Join using Name as Link
Second CurrentS ON CurrentS.FirstID= CurrentF.FirstID -- 4. Link Second and First table to get the proper SecondID.
WHERE Remark IS NOT NULL

SELECT Statement in CASE

Please don't downgrade this as it is bit complex for me to explain. I'm working on data migration so some of the structures look weird because it was designed by someone like that.
For ex, I have a table Person with PersonID and PersonName as columns. I have duplicates in the table.
I have Details table where I have PersonName stored in a column. This PersonName may or may not exist in the Person table. I need to retrieve PersonID from the matching records otherwise put some hardcode value in PersonID.
I can't write below query because PersonName is duplicated in Person Table, this join doubles the rows if there is a matching record due to join.
SELECT d.Fields, PersonID
FROM Details d
JOIN Person p ON d.PersonName = p.PersonName
The below query works but I don't know how to replace "NULL" with some value I want in place of NULL
SELECT d.Fields, (SELECT TOP 1 PersonID FROM Person where PersonName = d.PersonName )
FROM Details d
So, there are some PersonNames in the Details table which are not existent in Person table. How do I write CASE WHEN in this case?
I tried below but it didn't work
SELECT d.Fields,
CASE WHEN (SELECT TOP 1 PersonID
FROM Person
WHERE PersonName = d.PersonName) = null
THEN 123
ELSE (SELECT TOP 1 PersonID
FROM Person
WHERE PersonName = d.PersonName) END Name
FROM Details d
This query is still showing the same output as 2nd query. Please advise me on this. Let me know, if I'm unclear anywhere. Thanks
Well, I figured out I can wrap the SELECT in ISNULL to make it work (note that the subquery needs its own parentheses).
SELECT d.Fields,
ISNULL((SELECT TOP 1 p.PersonID
FROM Person p where p.PersonName = d.PersonName), 124) id
FROM Details d
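For reference, the CASE attempt in the question most likely returned the same output as the plain subquery because a comparison with = null does not evaluate to true under the default ANSI_NULLS ON setting, so the THEN branch is never taken. A sketch of the same CASE with an IS NULL test, assuming the tables from the question and 123 as the placeholder default; the ORDER BY is added here only to make TOP (1) deterministic:

```sql
SELECT d.Fields,
       CASE
           -- IS NULL is the test that detects a missing match
           WHEN (SELECT TOP (1) p.PersonID
                 FROM Person p
                 WHERE p.PersonName = d.PersonName
                 ORDER BY p.PersonID) IS NULL
               THEN 123
           ELSE (SELECT TOP (1) p.PersonID
                 FROM Person p
                 WHERE p.PersonName = d.PersonName
                 ORDER BY p.PersonID)
       END AS PersonID
FROM Details d;
```

This runs the subquery twice, which is why wrapping a single subquery in ISNULL or COALESCE is usually the shorter option.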
A simple left outer join to pull back all persons with an optional match on the Details table should work, with a CASE expression to get your desired result.
SELECT
*
FROM
(
SELECT
Instance=ROW_NUMBER() OVER (PARTITION BY p.PersonName ORDER BY p.PersonID), -- ROW_NUMBER requires ORDER BY; PersonName qualified to avoid ambiguity
PersonID=CASE WHEN d.PersonName IS NULL THEN 'XXXX' ELSE p.PersonID END,
d.Fields
FROM
Person p
LEFT OUTER JOIN Details d on d.PersonName=p.PersonName
)AS X
WHERE
Instance=1
Ooh goody, a chance to use two LEFT JOINs. The first will list the IDs where they exist, and insert a default otherwise; the second will eliminate the duplicates.
SELECT d.Fields, ISNULL(p1.PersonID, 123)
FROM Details d
LEFT JOIN Person p1 ON d.PersonName = p1.PersonName
LEFT JOIN Person p2 ON p2.PersonName = p1.PersonName
AND p2.PersonID < p1.PersonID
WHERE p2.PersonID IS NULL
You could use common table expressions to build up the missing datasets, i.e. your complete Person table, then join that to your Detail table as follows;
declare @n int;
-- set your default PersonID here;
set @n = 123;
-- Make sure the previous SQL statement is terminated with a semicolon so that the with clause parses successfully.
-- First build our unique list of names from table Details.
with cteUniqueDetailPerson
(
[PersonName]
)
as
(
select distinct [PersonName]
from [Details]
)
-- Second get unique Person entries and record the most recent PersonID value as the active Person.
, cteUniquePersonPerson
(
[PersonID]
, [PersonName]
)
as
(
select
max([PersonID]) -- if you wanted the original Person record instead of the last, change this to min.
, [PersonName]
from [Person]
group by [PersonName]
)
-- Third join unique datasets to get the PersonID when there is a match, otherwise use our default id @n.
-- NB, this would also include records when a Person exists with no Detail rows (they are filtered out with the final inner join)
, cteSudoPerson
(
[PersonID]
, [PersonName]
)
as
(
select
coalesce(upp.[PersonID],@n) as [PersonID]
, coalesce(upp.[PersonName],udp.[PersonName]) as [PersonName]
from cteUniquePersonPerson upp
full outer join cteUniqueDetailPerson udp
on udp.[PersonName] = upp.[PersonName]
)
-- Fourth, join detail to the sudo person table that includes either the original ID or our default ID.
select
d.[Fields]
, sp.[PersonID]
from [Details] d
inner join cteSudoPerson sp
on sp.[PersonName] = d.[PersonName];

SQL Loop/Crawler

I am trying to figure out some ways to accomplish this script. I import an excel sheet and then I need to populate 5 different tables based on this excel sheet. However for this example I just need help with the initial loop then I think I can work through the rest.
select distinct Department from IPACS_New_MasterList
where Department is not null
This provides me a list of 7 different departments.
Dep1, Dep2, Dep3, Dep4, Dep5, Dep6, Dep7
For each of these departments I need to perform some code.
Step #1:
Insert the department into table_one
I then need to keep the SCOPE_IDENTITY() for the rest of the code.
Step #2
Perform the second loop (inserting all functions in that department into table2).
I'm not sure how to really do a foreach row in this select statement loop, or if I need to do something completely different. I've looked at several answers but can't seem to find exactly what I'm looking for.
Sample Data:
Source Table
Dep1, func1, process1, procedure1
dep1, func1, process1, procedure2
dep1, func1, process2, procedure3
dep1, func1, process2, procedure4
dep1, func1, process2, procedure5
dep1, func2, process3, procedure6
dep2, func3, process4, procedure7
My Tables:
My first table is a list of every department from the above query. With a key on the departmentID. Each department can have many functions.
My second table is a list of all functions with a key on functionID and a foreign key on departmentID. Each function must have 1 department and can have many processes
My third table is a list of all processes with a key on processID and a foreign key on functionID. Each process must have 1 function and can have many procedures.
There are two approaches you can use without a loop.
1) If you have candidate keys in your source (department name) just join your source table back to the table you inserted
e.g.
INSERT INTO Department
(Name)
SELECT DISTINCT Dep1
FROM SOURCE;
INSERT INTO Functions
(
Name,
DepartmentID)
SELECT DISTINCT
s.Func1,
d.DepartmentID
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name;
INSERT INTO
processes
(
name,
FunctionID,
[Procedure]
)
SELECT
s.process1,
f.FunctionID,
s.procedure1
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN Functions f
on d.DepartmentID = f.departmentID
and s.func1 = f.name;
SQL Fiddle
2) If you don't have candidate keys in your source then you can use the output clause
For example, if a department weren't guaranteed to be unique, this would correctly find only the newly added rows:
DECLARE @Department TABLE
(
DepartmentID INT
)
DECLARE @Functions TABLE
(
FunctionID INT
)
INSERT INTO Department
(Name)
OUTPUT INSERTED.DepartmentID INTO @Department
SELECT DISTINCT Dep1
FROM SOURCE
INSERT INTO Functions
(
Name,
DepartmentID)
OUTPUT INSERTED.FunctionID INTO @Functions
SELECT DISTINCT
s.Func1,
d.DepartmentID
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN @Department d2
ON d.departmentID = d2.departmentID;
INSERT INTO
processes
(
name,
FunctionID,
[Procedure]
)
SELECT
s.process1,
f.FunctionID,
s.procedure1
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN Functions f
on d.DepartmentID = f.departmentID
and s.func1 = f.name
INNER JOIN @Functions f2
ON f.FunctionID = f2.FunctionID
SELECT * FROM Department;
SELECT * FROm Functions;
SELECT * FROM processes;
SQL Fiddle
If I am understanding what you are trying to do... yes, you can use a loop. It's not really talked about, and I bet I am going to get some feedback from other SQL developers that it's not a best practice. But if you really need to do a loop:
DECLARE @rowcount as int
DECLARE @numberOfRows as int
SET @rowcount = 0
SET @numberOfRows = (SELECT COUNT(*) from tablename) --put in anything to get the number of times to loop.
WHILE @rowcount < @numberOfRows
BEGIN
--Put whatever process you need to repeat here
SET @rowcount = @rowcount + 1
END
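If the loop has to visit each department value and keep the generated key, as the question describes, a cursor is one way to sketch it. The column name Name on table_one and the NVARCHAR(100) type are assumptions made for illustration; the question does not state them.

```sql
DECLARE @Department   NVARCHAR(100);  -- assumed type of IPACS_New_MasterList.Department
DECLARE @DepartmentID INT;

DECLARE dept_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT DISTINCT Department
    FROM IPACS_New_MasterList
    WHERE Department IS NOT NULL;

OPEN dept_cursor;
FETCH NEXT FROM dept_cursor INTO @Department;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Step 1: insert the department and remember its identity value
    INSERT INTO table_one (Name) VALUES (@Department);  -- hypothetical column name
    SET @DepartmentID = SCOPE_IDENTITY();

    -- Step 2 would go here: insert this department's functions using @DepartmentID

    FETCH NEXT FROM dept_cursor INTO @Department;
END

CLOSE dept_cursor;
DEALLOCATE dept_cursor;
```

For seven departments the cost is negligible, but the set-based inserts shown in the other answers avoid row-by-row processing altogether.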
Assuming you have tables set up with an IDENTITY field set for the Primary Key, you can populate each successive table's foreign key by joining to the previous table and the source table, something like:
INSERT INTO Table1
SELECT DISTINCT Department
FROM SourceTable
GO
INSERT INTO Table2
SELECT DISTINCT b.Department_ID, a.[Function]
FROM SourceTable a
JOIN Table1 b
ON a.Department = b.Department
GO
INSERT INTO Table3
SELECT DISTINCT b.Function_ID, a.Process
FROM SourceTable a
JOIN Table2 b
ON a.[Function] = b.[Function]
GO
INSERT INTO Table4
SELECT DISTINCT b.Process_ID, a.[Procedure]
FROM SourceTable a
JOIN Table3 b
ON a.Process = b.Process
GO