SQL Loop/Crawler

I am trying to figure out how to accomplish this script. I import an Excel sheet and then need to populate 5 different tables based on it. For this example, however, I just need help with the initial loop; I think I can work through the rest.
select distinct Department from IPACS_New_MasterList
where Department is not null
This provides me a list of 7 different departments.
Dep1, Dep2, Dep3, Dep4, Dep5, Dep6, Dep7
For each of these departments I need to perform some code.
Step #1:
Insert the department into table_one
I then need to keep the SCOPE_IDENTITY() for the rest of the code.
Step #2:
Perform the second loop (inserting all functions in that department into table2).
I'm not sure how to really do a foreach row in this select statement loop, or if I need to do something completely different. I've looked at several answers but can't seem to find exactly what I'm looking for.
Sample Data:
Source Table
Dep1, func1, process1, procedure1
dep1, func1, process1, procedure2
dep1, func1, process2, procedure3
dep1, func1, process2, procedure4
dep1, func1, process2, procedure5
dep1, func2, process3, procedure6
dep2, func3, process4, procedure7
My Tables:
My first table is a list of every department from the above query. With a key on the departmentID. Each department can have many functions.
My second table is a list of all functions with a key on functionID and a foreign key on departmentID. Each function must have 1 department and can have many processes
My third table is a list of all processes with a key on processID and a foreign key on functionID. Each process must have 1 function and can have many procedures.

There are two approaches you can use without a loop.
1) If you have candidate keys in your source (department name) just join your source table back to the table you inserted
e.g.
INSERT INTO Department
(Name)
SELECT DISTINCT Dep1
FROM SOURCE;
INSERT INTO Functions
(
Name,
DepartmentID)
SELECT DISTINCT
s.Func1,
d.DepartmentID
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name;
INSERT INTO
processes
(
name,
FunctionID,
[Procedure]
)
SELECT
s.process1,
f.FunctionID,
s.procedure1
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN Functions f
on d.DepartmentID = f.departmentID
and s.func1 = f.name;
SQL Fiddle
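The join-back approach can be tried end to end on the sample data from the question. The following is a minimal sketch, not the answer's original fiddle: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, the column names (dep, func, process, proc) are assumptions, and "Dep1" in the first sample row is normalized to "dep1".

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (dep TEXT, func TEXT, process TEXT, proc TEXT);
CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Functions (FunctionID INTEGER PRIMARY KEY, Name TEXT,
                        DepartmentID INTEGER REFERENCES Department);
CREATE TABLE Processes (ProcessID INTEGER PRIMARY KEY, Name TEXT,
                        FunctionID INTEGER REFERENCES Functions);
""")

# Sample rows from the question.
rows = [
    ("dep1", "func1", "process1", "procedure1"),
    ("dep1", "func1", "process1", "procedure2"),
    ("dep1", "func1", "process2", "procedure3"),
    ("dep1", "func1", "process2", "procedure4"),
    ("dep1", "func1", "process2", "procedure5"),
    ("dep1", "func2", "process3", "procedure6"),
    ("dep2", "func3", "process4", "procedure7"),
]
conn.executemany("INSERT INTO source VALUES (?, ?, ?, ?)", rows)

# Each level is filled with one set-based INSERT that joins the source
# back to the parent table on its natural key to pick up the generated ID.
conn.executescript("""
INSERT INTO Department (Name)
SELECT DISTINCT dep FROM source;

INSERT INTO Functions (Name, DepartmentID)
SELECT DISTINCT s.func, d.DepartmentID
FROM source s
JOIN Department d ON s.dep = d.Name;

INSERT INTO Processes (Name, FunctionID)
SELECT DISTINCT s.process, f.FunctionID
FROM source s
JOIN Department d ON s.dep = d.Name
JOIN Functions f ON f.DepartmentID = d.DepartmentID AND f.Name = s.func;
""")

counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("Department", "Functions", "Processes")
}
print(counts)
```

No loop is needed: the seven source rows collapse into two departments, three functions and four processes.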
2) If you don't have candidate keys in your source then you can use the OUTPUT clause.
For example, if a department name weren't guaranteed to be unique, this would still correctly pick up only the newly added rows:
DECLARE @Department TABLE
(
DepartmentID INT
);
DECLARE @Functions TABLE
(
FunctionID INT
);
INSERT INTO Department
(Name)
OUTPUT INSERTED.DepartmentID INTO @Department
SELECT DISTINCT Dep1
FROM SOURCE;
INSERT INTO Functions
(
Name,
DepartmentID)
OUTPUT INSERTED.FunctionID INTO @Functions
SELECT DISTINCT
s.Func1,
d.DepartmentID
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN @Department d2
ON d.departmentID = d2.departmentID;
INSERT INTO
processes
(
name,
FunctionID,
[Procedure]
)
SELECT
s.process1,
f.FunctionID,
s.procedure1
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN Functions f
on d.DepartmentID = f.departmentID
and s.func1 = f.name
INNER JOIN @Functions f2
ON f.FunctionID = f2.FunctionID;
SELECT * FROM Department;
SELECT * FROm Functions;
SELECT * FROM processes;
SQL Fiddle

If I am understanding what you are trying to do, yes, you can use a loop. It's not really talked about, and I bet I am going to get some feedback from other SQL developers that it's not a best practice. But if you really need a loop:
DECLARE @rowcount AS int
DECLARE @numberOfRows AS int
SET @rowcount = 0
SET @numberOfRows = (SELECT COUNT(*) FROM tablename) --put in anything to get the number of times to loop.
WHILE @rowcount < @numberOfRows
BEGIN
--Put whatever process you need to repeat here
SET @rowcount = @rowcount + 1
END
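For comparison, here is a rough sketch of the row-by-row version the question describes: insert one department, keep its new key, then insert that department's functions. It is an illustration only, written with Python's sqlite3 module rather than T-SQL; `cursor.lastrowid` plays the role that SCOPE_IDENTITY() would play in SQL Server, and the table and column names are assumptions based on the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (dep TEXT, func TEXT);
CREATE TABLE table_one (DepartmentID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE table2 (FunctionID INTEGER PRIMARY KEY, Name TEXT,
                     DepartmentID INTEGER);
INSERT INTO source VALUES
    ('dep1', 'func1'), ('dep1', 'func2'), ('dep2', 'func3'), (NULL, 'func9');
""")

# Outer "loop": one iteration per distinct, non-null department.
departments = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT dep FROM source WHERE dep IS NOT NULL ORDER BY dep"
    )
]

for dep in departments:
    # Step 1: insert the department and remember the generated key.
    cur = conn.execute("INSERT INTO table_one (Name) VALUES (?)", (dep,))
    department_id = cur.lastrowid

    # Step 2: insert every function of that department under the new key.
    conn.execute(
        "INSERT INTO table2 (Name, DepartmentID) "
        "SELECT DISTINCT func, ? FROM source WHERE dep = ?",
        (department_id, dep),
    )

functions = conn.execute(
    "SELECT Name, DepartmentID FROM table2 ORDER BY Name"
).fetchall()
print(functions)
```

The loop works, but it issues two statements per department, which is why a set-based insert is usually preferred when a natural key is available.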

Assuming you have tables set up with an IDENTITY field set for the Primary Key, you can populate each successive table's foreign key by joining to the previous table and the source table, something like:
INSERT INTO Table1
SELECT DISTINCT Department
FROM SourceTable
GO
INSERT INTO Table2
SELECT DISTINCT b.Department_ID, a.[Function]
FROM SourceTable a
JOIN Table1 b
ON a.Department = b.Department
GO
INSERT INTO Table3
SELECT DISTINCT b.Function_ID, a.Process
FROM SourceTable a
JOIN Table2 b
ON a.[Function] = b.[Function]
GO
INSERT INTO Table4
SELECT DISTINCT b.Process_ID, a.[Procedure]
FROM SourceTable a
JOIN Table3 b
ON a.Process = b.Process
GO

Related

Assign lowest record ID with Outer Apply

Assume I have those tables:
CREATE TABLE Employee (ID int, EmployeeIdentifier varchar(100),ManagerIdentifier varchar(100))
CREATE TABLE EmployeeManager (ID int, EmployeeID varchar(100))
INSERT Employee
VALUES
(1,'apple','apple'),
(2,'banana','apple'),
(3,'citrus','apple'),
(4,'grape','grape'),
(5,'grape','grape'),
(6,'grape','grape')
INSERT EmployeeManager
VALUES
(1,1),
(2,1),
(3,1),
(4,4),
(5,5),
(6,5)
For Employee.ID IN (1,2,3), records in EmployeeManager look fine.
But for Employee.ID IN (4,5,6) we can see many duplicates. We are not allowed to delete any records from the Employee table, but we are free to assign the EmployeeManager.EmployeeID value. Since there is only one actual record for grape and the rest are duplicates, I want to set EmployeeManager.EmployeeID to the minimum Employee.ID among all the duplicated grape records in the Employee table, i.e. 4.
I have this query,
UPDATE d SET EmployeeID = l.ID
FROM dbo.EmployeeManager d
INNER JOIN Employee s on d.ID=s.ID
OUTER APPLY (
SELECT ID
FROM Employee l
WHERE s.ManagerIdentifier=l.EmployeeIdentifier
) l
WHERE
EXISTS (
SELECT d.EmployeeID
EXCEPT
SELECT l.ID
)
If you keep running it you will see that EmployeeManager.EmployeeID values for ID (4,5,6) will keep changing.
How can I change the above update statement so that it assigns the lowest Employee.ID to all of EmployeeManager.ID (4,5,6), i.e. 4?
We are not allowed to run a one-time fix script, because corrupted data can keep arriving in the above table.
After running the update statement, the desired output is EmployeeManager.EmployeeID = 4 for IDs 4, 5 and 6.

You need TOP (1) and ORDER BY in the subquery to pick out a specific row:
UPDATE d SET EmployeeID = l.ID
FROM dbo.EmployeeManager d
INNER JOIN Employee s on d.ID=s.ID
OUTER APPLY (
SELECT TOP (1) ID
FROM Employee l
WHERE s.ManagerIdentifier = l.EmployeeIdentifier
ORDER BY ID
) l
WHERE
EXISTS (
SELECT d.EmployeeID
EXCEPT
SELECT l.ID
)
You appear to have a normalization issue, as the manager is defined in two places.
I suggest you use better aliases for your tables; the current ones are not very memorable.
You can change OUTER APPLY to CROSS APPLY, and then use a standard <> comparison instead of the EXISTS/EXCEPT:
CROSS APPLY (
SELECT TOP (1) ID
FROM Employee l
WHERE s.ManagerIdentifier = l.EmployeeIdentifier
ORDER BY ID
) l
WHERE d.EmployeeID <> l.ID
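The effect of picking one deterministic row can be checked with a small sketch. SQLite has neither APPLY nor TOP, so this illustration swaps in a correlated subquery with MIN(ID), which selects the same row as TOP (1) ... ORDER BY ID; it is an equivalent for this data, not the answer's exact statement. The EmployeeID column is declared as an integer here for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ID INTEGER, EmployeeIdentifier TEXT,
                       ManagerIdentifier TEXT);
CREATE TABLE EmployeeManager (ID INTEGER, EmployeeID INTEGER);
INSERT INTO Employee VALUES
    (1, 'apple', 'apple'), (2, 'banana', 'apple'), (3, 'citrus', 'apple'),
    (4, 'grape', 'grape'), (5, 'grape', 'grape'), (6, 'grape', 'grape');
INSERT INTO EmployeeManager VALUES
    (1, 1), (2, 1), (3, 1), (4, 4), (5, 5), (6, 5);
""")

# For each EmployeeManager row, look up the employee's manager identifier
# and take the lowest Employee.ID that carries that identifier.
update_sql = """
UPDATE EmployeeManager
SET EmployeeID = (
    SELECT MIN(l.ID)
    FROM Employee s
    JOIN Employee l ON l.EmployeeIdentifier = s.ManagerIdentifier
    WHERE s.ID = EmployeeManager.ID
)
WHERE EXISTS (SELECT 1 FROM Employee s WHERE s.ID = EmployeeManager.ID)
"""

# Running the statement repeatedly must not change the outcome.
for _ in range(3):
    conn.execute(update_sql)

result = conn.execute(
    "SELECT ID, EmployeeID FROM EmployeeManager ORDER BY ID"
).fetchall()
print(result)
```

Because the minimum is well defined, repeated runs leave IDs 4, 5 and 6 pointing at 4 instead of drifting between the duplicate rows.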

Referential integrity between tables in SQL Server

I have 2 tables, Members and Enrollments. Both tables can be joined using primary key Member ID.
I need to write a query which returns all the members in the Members table which don't have a corresponding row in the Enrollments table and vice versa.
This is what I have so far:
IF OBJECT_ID('tempdb..#memberswithoutenrollments') IS NOT NULL
DROP TABLE #memberswithoutenrollments
SELECT m.*
INTO #memberswithoutenrollments
FROM ABC_Members m
LEFT OUTER JOIN ABC_MemEnrollment e ON m.MemberID = MemberID

FULL JOIN is a simple method for comparing lists between two tables:
SELECT COALESCE(e.MemberID, m.MemberID),
(CASE WHEN e.MemberID IS NULL THEN 'No Enrollments' ELSE 'No Member' END)
FROM ABC_Members m FULL JOIN
ABC_MemEnrollment e
ON m.MemberID = e.MemberID
WHERE e.MemberID IS NULL OR m.MemberID IS NULL;
But if you have proper foreign key relationships, then you should never have enrollments without members.
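A quick way to see the two-sided comparison in action is the sketch below. It uses Python's sqlite3 module, and because older SQLite builds lack FULL JOIN it emulates the same result with two LEFT JOIN anti-joins combined by UNION ALL. The sample IDs are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ABC_Members (MemberID INTEGER);
CREATE TABLE ABC_MemEnrollment (MemberID INTEGER);
INSERT INTO ABC_Members VALUES (1), (2), (3);        -- 1 has no enrollment
INSERT INTO ABC_MemEnrollment VALUES (2), (3), (4);  -- 4 has no member
""")

# First branch: members without enrollments.
# Second branch: enrollments without members.
mismatches = conn.execute("""
SELECT m.MemberID, 'No Enrollments' AS Problem
FROM ABC_Members m
LEFT JOIN ABC_MemEnrollment e ON m.MemberID = e.MemberID
WHERE e.MemberID IS NULL
UNION ALL
SELECT e.MemberID, 'No Member'
FROM ABC_MemEnrollment e
LEFT JOIN ABC_Members m ON m.MemberID = e.MemberID
WHERE m.MemberID IS NULL
ORDER BY 1
""").fetchall()
print(mismatches)
```

Each branch is an ordinary anti-join, so either half can also be run on its own when only one direction matters.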

You can use NOT IN to your benefit here.
WITH
-- Create a list of every member ID that has an enrollment
in_table AS
(
SELECT
Member_ID
FROM
Enrollments
),
result_table AS
(
SELECT
*
FROM
Members
-- Grab only the values from members that DO NOT APPEAR in in_table
WHERE
MemberID NOT IN (SELECT DISTINCT Member_ID FROM in_table)
)
-- Grab all results
SELECT * FROM result_table
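One caveat with NOT IN is worth checking before relying on it: if the subquery can return a NULL, NOT IN matches no rows at all, while NOT EXISTS still behaves as expected. The sketch below demonstrates this with Python's sqlite3 module; the table contents are invented for the illustration, and the same three-valued logic applies in SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Members (MemberID INTEGER);
CREATE TABLE Enrollments (Member_ID INTEGER);
INSERT INTO Members VALUES (1), (2), (3);
INSERT INTO Enrollments VALUES (2), (NULL);  -- note the NULL key
""")

# NOT IN compares against NULL, which yields "unknown" for every row.
not_in_rows = conn.execute("""
SELECT MemberID FROM Members
WHERE MemberID NOT IN (SELECT Member_ID FROM Enrollments)
ORDER BY MemberID
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULLs are harmless.
not_exists_rows = conn.execute("""
SELECT m.MemberID FROM Members m
WHERE NOT EXISTS (
    SELECT 1 FROM Enrollments e WHERE e.Member_ID = m.MemberID
)
ORDER BY m.MemberID
""").fetchall()

print(not_in_rows, not_exists_rows)
```

If the enrollment key is declared NOT NULL, both forms return the same rows.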

How to insert data in multiple rows of temp tables in sql

How can I insert values into the same row? I want the data from all of these columns to land in the first row, then the second, and so on. Instead, my query first inserts all the customer numbers, then adds the status values as new rows after the last customer number, and then does the same for the customer types.
CREATE TABLE #tblCustomer
(
CustomerNumber NVARCHAR(1000),
Status NVARCHAR (1000),
CustomerType NVARCHAR (1000)
)
INSERT
INTO #tblCustomer (CustomerNumber)
Select c.CustomerNumber
From Customer.Customer c
INSERT
INTO #tblCustomer (Status)
Select ses.Name
From Customer.Customer c
Left Outer Join COM.StatusEngine_EntityStatus sees
On c.Status = sees.EntityStatusId
And sees.EntityId = 'CustomerStatus'
Join COM.StatusEngine_Status ses
On sees.Status = ses.Status
INSERT
INTO #tblCustomer (CustomerType)
select t.Description
From Customer.Customer c
Join Customer.Type t
On c.TypeId = t.pkTypeId
Receiving output:
0001 null null
0002 null null
NULL active null
NULL active null
NULL null individual
NULL null individual
Expected Output:
0001 active individual
0002 active individual

Without knowing more about your tables, you can insert the first records like so...
INSERT INTO #tblCustomer (CustomerNumber)
select c.CustomerNumber from Customer.Customer c
And then update the remaining columns this way...
UPDATE temp
set Status = ses.Name
from Customer.Customer c
left outer join COM.StatusEngine_EntityStatus sees
on c.Status = sees.EntityStatusId and sees.EntityId = 'CustomerStatus'
join COM.StatusEngine_Status ses
on sees.Status = ses.Status
join #tblCustomer temp
on c.CustomerNumber = temp.CustomerNumber
However, doing it like this is really inefficient; you should strive to write a single insert that fills all columns in one go.
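A single INSERT ... SELECT that fills every column at once might look like the following sketch. It runs on Python's sqlite3 module with deliberately simplified, hypothetical tables (Customer, StatusName, CustomerType) in place of the schema-qualified tables from the question, so treat the names as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the tables in the question (assumed names).
CREATE TABLE Customer (CustomerNumber TEXT, StatusId INTEGER, TypeId INTEGER);
CREATE TABLE StatusName (StatusId INTEGER, Name TEXT);
CREATE TABLE CustomerType (pkTypeId INTEGER, Description TEXT);
CREATE TABLE tblCustomer (CustomerNumber TEXT, Status TEXT, CustomerType TEXT);

INSERT INTO Customer VALUES ('0001', 1, 7), ('0002', 1, 7);
INSERT INTO StatusName VALUES (1, 'active');
INSERT INTO CustomerType VALUES (7, 'individual');
""")

# One statement: every target column comes from the same joined row,
# so the values for a customer end up side by side.
conn.execute("""
INSERT INTO tblCustomer (CustomerNumber, Status, CustomerType)
SELECT c.CustomerNumber, s.Name, t.Description
FROM Customer c
LEFT JOIN StatusName s ON s.StatusId = c.StatusId
LEFT JOIN CustomerType t ON t.pkTypeId = c.TypeId
""")

filled = conn.execute(
    "SELECT * FROM tblCustomer ORDER BY CustomerNumber"
).fetchall()
print(filled)
```

Three separate inserts create three separate sets of rows; joining first and inserting once is what keeps the columns aligned.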

You can do it like this (I have verified the code with the Northwind sample database from Microsoft - I have chosen that one since you can use it for each SQL server version since SQL 2000):
declare @NumberOfItems int = 10;
CREATE TABLE #tblCustomer (
CustomerNumber NVARCHAR(1000)
,Name NVARCHAR (1000)
,CustomerType NVARCHAR (1000))
insert into #tblCustomer
select CustomerNumber, Name, Status from (select top(@NumberOfItems) ROW_NUMBER() OVER(ORDER BY CustomerID) as No, CustomerID as CustomerNumber from Customers) c
left join (select * from (select top(@NumberOfItems) ROW_NUMBER() OVER(ORDER BY ContactName) as No, ContactName as Name from Customers) q2) j1 on c.No=j1.No
left join (select * from (select top(@NumberOfItems) ROW_NUMBER() OVER(ORDER BY ContactTitle) as No, ContactTitle as Status from Customers) q3) j2 on c.No=j2.No
select * from #tblCustomer
drop table #tblCustomer
It will create a column with numbers from 1 to n for each element you want to import and then it joins it together.
Note: While this works, it is not the preferred way to do it, because there is no primary key. Normally one would use primary key / foreign key relationships to join the data together. The way you intend to fill the table combines data that doesn't necessarily belong together: each column is sorted on its own, and the values are then matched up by row number. If you have no primary key because you're importing data from other sources, you can add WHERE clauses to create a better connection between the inner and the outer select statements.

This is untested, however, I believe this is what you're after:
INSERT INTO #tblCustomer (CustomerNumber, [Status], CustomerType)
SELECT c.CustomerNumber, ses.[Name], t.[Description]
FROM Customer.Customer c
JOIN COM.StatusEngine_EntityStatus sees ON c.Status = sees.EntityStatusId --Changed to JOIN, as it is turned into an implicit INNER join by the next JOIN
AND sees.EntityId = 'CustomerStatus'
JOIN COM.StatusEngine_Status ses ON sees.[Status] = ses.[Status]
JOIN Customer.Type t ON c.TypeId = t.pkTypeId;
Note my comment regarding your LEFT OUTER JOIN, in that I've changed it to an INNER JOIN.

Straightforward SQL here:
CREATE TABLE #tblCustomer
(
CustomerNumber NVARCHAR(1000),
Status NVARCHAR (1000),
CustomerType NVARCHAR (1000)
)
INSERT INTO #tblCustomer (CustomerNumber, Status, CustomerType)
SELECT DISTINCT
c.CustomerNumber,
ses.Name,
t.Description
FROM Customer.Customer c
LEFT OUTER JOIN COM.StatusEngine_EntityStatus sees
On c.Status = sees.EntityStatusId
And sees.EntityId = 'CustomerStatus'
LEFT OUTER JOIN COM.StatusEngine_Status ses
On sees.Status = ses.Status
LEFT OUTER JOIN Customer.Type t
On c.TypeId = t.pkTypeId

SQL Conditional JOIN

I have three tables: 'Employees', 'Departments' and 'EmployeesInDepartments'.
The 'Employees' table references the 'Departments' table (each employee must have a DepartmentId). However, an employee can exist in multiple departments by adding entries (EmployeeId and DepartmentId) to the 'EmployeesInDepartments' table.
I currently have the following stored procedure to retrieve employees by departmentId:
CREATE PROCEDURE dbo.CollectEmployeesByDepartmentId
(
@DepartmentId int,
@IsDeleted bit
)
AS
BEGIN
SELECT Employees.*
FROM Employees
WHERE ((Employees.IsDeleted = @IsDeleted )
AND ((Employees.DepartmentId = @DepartmentId)
OR (Employees.EmployeeId IN (SELECT EmployeesInDepartments.EmployeeId
FROM EmployeesInDepartments
WHERE (EmployeesInDepartments.DepartmentId = @DepartmentId)
)
)
)
)
END
How can I optimize this stored procedure and possibly use JOINS?

My first recommendation is to remove DepartmentId from the Employees table and insert all of those records into the EmployeesInDepartments table.
Then it's a simple inner join.
And of course, never use SELECT * in production code.

Here's my re-write of your query:
WITH summary AS (
SELECT e.*
FROM EMPLOYEES e
WHERE e.isdeleted = @IsDeleted
AND e.parentid = 0)
SELECT a.*
FROM summary a
WHERE a.departmentid = @DepartmentId
UNION
SELECT b.*
FROM summary b
JOIN EMPLOYEESINDEPARTMENTS ed ON ed.employeeid = b.employeeid
AND ed.departmentid = @DepartmentId
The UNION is necessary to remove duplicates - if you know there'll never be duplicates, change UNION to UNION ALL.
The CTE called "summary" doesn't provide any performance benefit, it's just shorthand.
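The difference between UNION and UNION ALL shows up as soon as an employee's home department also appears in EmployeesInDepartments. Below is a small sketch using Python's sqlite3 module; the rows are invented for the illustration, and named parameters stand in for the stored procedure's arguments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeId INTEGER, DepartmentId INTEGER,
                        IsDeleted INTEGER);
CREATE TABLE EmployeesInDepartments (EmployeeId INTEGER, DepartmentId INTEGER);
INSERT INTO Employees VALUES (1, 10, 0), (2, 20, 0), (3, 10, 1);
-- Employee 1 is listed again for the home department; 2 is a guest in 10.
INSERT INTO EmployeesInDepartments VALUES (1, 10), (2, 10);
""")

# The set operator is substituted so both variants share one query text.
query = """
SELECT e.EmployeeId
FROM Employees e
WHERE e.IsDeleted = :deleted AND e.DepartmentId = :dept
{op}
SELECT e.EmployeeId
FROM Employees e
JOIN EmployeesInDepartments ed
  ON ed.EmployeeId = e.EmployeeId AND ed.DepartmentId = :dept
WHERE e.IsDeleted = :deleted
ORDER BY 1
"""
params = {"deleted": 0, "dept": 10}

with_union = [r[0] for r in conn.execute(query.format(op="UNION"), params)]
with_union_all = [
    r[0] for r in conn.execute(query.format(op="UNION ALL"), params)
]
print(with_union, with_union_all)
```

Employee 1 qualifies through both branches, so UNION ALL reports that row twice while UNION collapses it.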

SELECT E.*
FROM Employees E
Left Join EmployeesInDepartments EID ON E.EmployeeId = EID.EmployeeId
And E.DepartmentId <> @DepartmentId
WHERE E.IsDeleted = @IsDeleted
And
(
E.DepartmentId = @DepartmentId
Or (EID.DepartmentId = @DepartmentId)
)
Edit to include IsDeleted logic.
I do agree with some of the other answers, your design should probably be changed. But this query should do it. If you have duplicates in EmployeesInDepartments, you can change that to a Select Distinct.

I would suggest you change the IN clause in your query to WHERE EXISTS.
With IN, the query may not benefit from the indexes defined on the table and can end up performing a complete table scan, which hurts query performance.
You can check this thread to convert an IN to WHERE EXISTS:
Changing IN to EXISTS in SQL

Writing a complex trigger

I am using SQL Server 2000. I am writing a trigger that is executed when the field Applicant.AppStatusRowID is updated.
Table Applicant is linked to table Location, table Company & table AppStatus.
My issue is creating the joins in my query.
When Applicant.AppStatusRowID is updated, I want to get the values from
Applicant.AppStatusRowID, Applicant.FirstName, Applicant.Lastname, Location.LocNumber, Location.LocationName, Company.CompanyCode, AppStatus.DisplayText
The joins would be :
Select * from Applicant A
Inner Join AppStatus ast on ast.RowID = a.AppStatusRowID
Inner Join Location l on l.RowID = a.LocationRowID
Inner Join Company c on c.RowID = l.CompanyRowID
This is to be inserted into an Audit table (fields are ApplicantID, LastName, FirstName, Date, Time, Company, Location Number, Location Name, StatusDisposition, User)
My issue is the query for the inner join...

First, let's introduce you to the inserted and deleted pseudotables, which are available only in triggers. Inserted has the new values, and deleted has the old values or the records being deleted.
You do not want to insert all records into your audit table, only those in inserted.
So, to insert into an audit table, you might want something like this inside the trigger code:
insert Myaudittable (<insert field list here>)
Select <insert field list here> from Inserted i
Inner Join AppStatus ast on ast.RowID = i.AppStatusRowID
Inner Join Location l on l.RowID = i.LocationRowID
Inner Join Company c on c.RowID = l.CompanyRowID
I would personally add columns for old and new values, a column for the type of change, the date of the change and the user who made it, but I'm sure you have your own requirements to follow.
I suggest you read about triggers in Books Online, as they can be tricky to get right.
Here's one way to test and debug a trigger that I often use. First I create temp tables named #deleted and #inserted that have the structure of the table I'm going to put the trigger on. Then I write the code to use those instead of the deleted or inserted tables. That way I can look at things as I go and make sure everything is right before I change the code to a trigger. Example below with your code added in and modified slightly:
Create table #inserted(Rowid int, lastname varchar(100), firstname varchar(100), appstatusRowid int)
Insert #inserted
select 1, 'Jones', 'Ed', 30
union all
select 2, 'Smith', 'Betty', 20
Create table #deleted (Rowid int, lastname varchar(100), firstname varchar(100), appstatusRowid int)
Insert #deleted
select 1, 'Jones', 'Ed', 10
union all
select 2, 'Smith', 'Betty', 20
--CREATE TRIGGER tri_UpdateAppDisp ON dbo.Test_App
--For Update
--AS
--If Update(appstatusrowid)
IF exists (select i.appstatusRowid from #inserted i join #deleted d on i.rowid = d.rowid
Where d.appstatusrowid <> i.appstatusrowid)
BEGIN
--Insert AppDisp(AppID, LastName, FirstName, [DateTime],Company,Location,LocationName, StatusDisp,[Username])
Select d.Rowid,d.LastName, d.FirstName, getDate(),C.CompanyCode,
l.locnum,l.locname, ast.Displaytext, SUSER_SNAME()+' '+User
From #deleted d
Join #inserted i on i.rowid = d.rowid
--From deleted d
--Join inserted i on i.rowid = d.rowid
Inner join Test_App a with (nolock) on a.RowID = d.rowid
inner join location l with (nolock) on l.rowid = a.LocationRowID
inner join appstatus ast with (nolock) on ast.rowid = d.appstatusrowid
inner join company c with (nolock) on c.rowid = l.CompanyRowid
Where d.appstatusrowid <> i.appstatusrowid
end
Once you get the data for the select correct, it is easy to uncomment the trigger code and the insert line and change #deleted or #inserted to deleted or inserted.
You'll note I had two records in the temp tables, one of which met your condition and one of which did not. This allows you to test multiple-record updates, as well as results that meet the condition and ones that don't. All triggers should be written to handle multiple records, as they are fired once per batch, not row by row.
