sql check for logical errors in 2 columns - sql

Assuming I have an employee table with 2 columns only:
employee_id
manager_id
All employees added to this table would have an accompanying manager_id that is actually an employee_id that already exists (save for one, the CEO probably doesn't have a manager, but that's not important).
If A is the manager of B, how do we enforce a check so that A's manager can be any value except B, since making B the manager of A would violate the business rule?

I'd say the best way would be to create a TRIGGER on the insert into the table that would simply check that manager_id NOT IN (SELECT employee_id from employee where manager_id = %insertid%).

Half of the answer is a foreign key: manager_id references employee(employee_id)
The other half is a check constraint, manager_id<>employee_id
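To see what those two constraints do and do not catch, here is a minimal sketch. It uses SQLite through Python's sqlite3 module only so that it can be run anywhere; the column names come from the question, while the helper try_insert and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        manager_id  INTEGER REFERENCES employee(employee_id),
        CHECK (manager_id <> employee_id)
    )
""")
conn.execute("INSERT INTO employee VALUES (1, NULL)")  # CEO, no manager
conn.execute("INSERT INTO employee VALUES (2, 1)")

def try_insert(employee_id, manager_id):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (employee_id, manager_id))
        return True
    except sqlite3.IntegrityError:
        return False

self_managed = try_insert(3, 3)    # rejected by the CHECK constraint
ghost_manager = try_insert(4, 99)  # rejected by the foreign key
valid = try_insert(5, 2)           # accepted

# Neither constraint stops a two-step cycle: 2 reports to 1, and now 1 reports to 2.
conn.execute("UPDATE employee SET manager_id = 2 WHERE employee_id = 1")
```

The last statement succeeds, which is exactly the A/B situation from the question, so these constraints cover the simple cases only.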

The problem goes deeper than that: you want to avoid any cycles in your graph, making it effectively a tree.
I think you're better off doing that at the application level.
UPDATE: But if you prefer to do it with a trigger, take a look at common table expressions (CTEs). You can create a recursive query in a trigger that checks for cycles:
create trigger prevent_management_cycles on employee
instead of update
as
declare @found_rows int
-- walk down from each updated employee to collect all current subordinates
;with cycle_detector (employee_id) as (
select employee_id from inserted
union all
select employee.employee_id from employee
join cycle_detector
on employee.manager_id = cycle_detector.employee_id
)
-- a cycle exists if the new manager is the employee or one of their subordinates
select @found_rows = count(*)
from cycle_detector
join inserted
on inserted.manager_id = cycle_detector.employee_id
if @found_rows > 0
-- severity 16 reports a real error; severity 1 would only be an informational message
raiserror('cycle detected!', 16, 1)
else
-- carry on original update
update employee
set employee.manager_id = inserted.manager_id
-- other columns...
from employee
join inserted on employee.employee_id = inserted.employee_id
Note: it's assumed that employee_id is a primary key and manager_id is a foreign key pointing back to employee.employee_id.
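If you go the application-level route instead, the same check can be sketched as below. This is not the trigger above: it walks up from the proposed manager rather than down from the employee, and it runs on SQLite via Python's sqlite3 purely so the example is runnable. The function name would_create_cycle and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, manager_id INTEGER)")
# Sample hierarchy: 1 is the CEO, 2 reports to 1, 3 and 4 report to 2.
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, None), (2, 1), (3, 2), (4, 2)])

def would_create_cycle(conn, employee_id, new_manager_id):
    """True if making new_manager_id the manager of employee_id closes a loop."""
    row = conn.execute("""
        WITH RECURSIVE chain(employee_id) AS (
            SELECT ?                      -- start at the proposed manager
            UNION                         -- UNION (not UNION ALL) stops on repeats
            SELECT e.manager_id           -- then follow manager links upward
            FROM employee AS e
            JOIN chain ON e.employee_id = chain.employee_id
            WHERE e.manager_id IS NOT NULL
        )
        SELECT COUNT(*) FROM chain WHERE employee_id = ?
    """, (new_manager_id, employee_id)).fetchone()
    return row[0] > 0

ceo_under_3 = would_create_cycle(conn, 1, 3)   # 3 -> 2 -> 1, so 1 is already above 3
move_4_to_3 = would_create_cycle(conn, 4, 3)   # 4 is not above 3
self_manage = would_create_cycle(conn, 2, 2)   # an employee managing themselves
```

Call it before the update and refuse the change when it returns True. The check and the update should run in the same transaction, otherwise two concurrent changes could still produce a cycle between them.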

Related

creating the sql query for company supervisors

I have four tables
create table emp (emp_ss int, emp_name nvarchar(20));
create table comp(comp_name nvarchar(20), comp_address nvarchar(20));
create table works (emp_ss int, comp_name nvarchar(20));
create table supervises (spv_ss int, emp_ss int );
Here SPV_SS and EMP_SS are subsets of SS. Now I have to find:
the name of all the companies who have more than 4 supervisors
I have made a query for the above problem but not sure whether it is correct or not
SELECT COMP_NAME , COUNT(EMP_SS) FROM WORKS
WHERE EMP_SS IN (SELECT DISTINCT SPV_SS FROM supervises)
GROUP BY COMP_NAME
HAVING COUNT(EMP_SS) > 4;
the name of supervisors who have the largest number of employees
but unable to get the required result of the above condition
SELECT SPV_SS, COUNT(*) max_ FROM supervises GROUP BY SPV_SS
You don't need a separate table for supervisors unless they come with extra information that doesn't belong in the employee table; just add an extra field (foreign key) to the Employee table that links to the primary key of the same table.
First question: select the company, use a group by companyid clause, and then check whether the count of supervisors is larger than 4.
Second question: select count(empid) and supervisor, use a group by supervisor clause, and add an order by clause on the count column.
I explained the logic, as for the actual sql code, you're gonna have to figure that out yourself.
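For anyone who wants to see that logic spelled out, here is one possible sketch against the original four-table design (not the single-table redesign suggested above). It runs on SQLite through Python's sqlite3 so it can be tried directly; the sample data is invented, and the second query returns every supervisor tied for the maximum rather than relying on ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_ss INTEGER, emp_name TEXT);
    CREATE TABLE works (emp_ss INTEGER, comp_name TEXT);
    CREATE TABLE supervises (spv_ss INTEGER, emp_ss INTEGER);
""")
# Made-up sample data: employees 1-5 supervise at Acme, 6-7 at Beta.
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [(n, f"emp{n}") for n in range(1, 13)])
conn.executemany("INSERT INTO works VALUES (?, ?)",
                 [(n, "Acme") for n in (1, 2, 3, 4, 5, 8, 9, 10, 11)]
                 + [(n, "Beta") for n in (6, 7, 12)])
conn.executemany("INSERT INTO supervises VALUES (?, ?)",
                 [(1, 8), (1, 9), (1, 10), (2, 8), (3, 9),
                  (4, 10), (5, 11), (6, 12), (7, 12)])

# Companies with more than 4 distinct supervisors on their payroll.
companies = [r[0] for r in conn.execute("""
    SELECT w.comp_name
    FROM works AS w
    WHERE w.emp_ss IN (SELECT spv_ss FROM supervises)
    GROUP BY w.comp_name
    HAVING COUNT(DISTINCT w.emp_ss) > 4
""")]

# Supervisors whose employee count equals the largest count of any supervisor.
top_supervisors = [r[0] for r in conn.execute("""
    SELECT e.emp_name
    FROM supervises AS s
    JOIN emp AS e ON e.emp_ss = s.spv_ss
    GROUP BY s.spv_ss, e.emp_name
    HAVING COUNT(*) = (SELECT MAX(n)
                       FROM (SELECT COUNT(*) AS n
                             FROM supervises
                             GROUP BY spv_ss) AS counts)
""")]
```

COUNT(DISTINCT ...) guards against an employee being listed twice for the same company, which the query in the question would count twice.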

SQL Server - Cascading DELETE with Recursive Foreign Keys

I've spent a good amount of time trying to figure out how to implement a CASCADE ON DELETE for recursive foreign keys on SQL Server. I've read about triggers, creating temporary tables, etc. but have yet to find an answer that will work with my database design.
Here is a Boss/Employee database example that will work for demonstration purposes:
TABLE employee
id|name |boss_id
--|---------|-------
1 |John |1
2 |Hillary |1
3 |Hamilton |1
4 |Scott |2
5 |Susan |2
6 |Seth |2
7 |Rick |5
8 |Rachael |5
As you can see, each employee has a boss that is also an employee. So, there is a PK/FK relationship on id/boss_id.
Here is an (abbreviated) table with their information:
TABLE information
emp_id|street |phone
------|-----------|-----
2 |blah blah |blah
6 |blah blah |blah
7 |blah blah |blah
There is a PK/FK on employee.id/information.emp_id with a CASCADE ON DELETE.
For example, if Rick was fired, we would do this:
DELETE FROM employee WHERE id=7
This should delete Rick's rows from both employee and information. Yay cascade!
Now, say we've hit hard times and we need to lay off Hamilton and his entire department. This means that we would need to remove
Hamilton
Scott
Susan
Seth
Rick
Rachael
From both the employee and information tables when we run:
DELETE FROM employee WHERE id=3
I tried a simple CASCADE ON DELETE for id/emp_id, but SQL Server wasn't having it:
Introducing FOREIGN KEY constraint 'fk_boss_employee' on table 'employee' may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.
I was able to use CASCADE ON DELETE on a test database in Access, and it behaved exactly as I wanted it to. Again, I want every possible child, grandchild, great-grandchild, etc of a parent to be deleted if their parent, grandparent, great-grandparent, etc is deleted.
When I tried using triggers, I couldn't seem to get it to trigger itself (eg. when you try to delete Hamilton's employee Susan, first see if Susan has any employees, etc) let alone going down N-number of employees.
So! I think I've provided every detail I can think of. If something still isn't clear, I'll try to improve this description.
Necromancing.
There are 2 simple solutions.
You can either read Microsoft's sorry excuse(s) for why they didn't implement this (because it is difficult and time-consuming, and time is money), along with the explanation of why you don't/shouldn't need it (although you do), and implement the delete function with a cursor in a stored procedure,
because you don't really need delete cascade, because you always have the time to change ALL of your and ALL of OTHER people's code (like interfaces to other systems) everywhere, anytime, that deletes an employee (or employees, note: plural), including all superordinate and subordinate objects (also whenever one or several new ones are added), in this database and in any other copies of this database for other customers, especially in production when you don't have access to the database (oh, and on the test system, and the integration system, and local copies of production, test, and integration)
or
you can use a proper DBMS that actually supports recursive cascaded deletes, like PostgreSQL (as long as the graph is directed and acyclic; else ERROR on delete).
PS: That's sarcasm.
Note:
As long as your delete does not stem from a cascade, and you just want to perform a delete on a self-referencing table, you can delete any entry, as long as you remove all subordinate objects as well in the in-clause.
So to delete such an object, do the following:
;WITH CTE AS
(
SELECT id, boss_id, [name] FROM employee
-- WHERE boss_id IS NULL
WHERE id = 2 -- <== this here is the id you want to delete !
UNION ALL
SELECT employee.id, employee.boss_id, employee.[name] FROM employee
INNER JOIN CTE ON CTE.id = employee.boss_id
)
DELETE FROM employee
WHERE employee.id IN (SELECT id FROM CTE)
Assuming you have the following table structure:
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'dbo.employee') AND type in (N'U'))
BEGIN
CREATE TABLE dbo.employee
(
id int NOT NULL,
boss_id int NULL,
[name] varchar(50) NULL,
CONSTRAINT PK_employee PRIMARY KEY ( id )
);
END
GO
IF NOT EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'dbo.FK_employee_employee') AND parent_object_id = OBJECT_ID(N'dbo.employee'))
ALTER TABLE dbo.employee WITH CHECK ADD CONSTRAINT FK_employee_employee FOREIGN KEY(boss_id)
REFERENCES dbo.employee (id)
GO
IF EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'dbo.FK_employee_employee') AND parent_object_id = OBJECT_ID(N'dbo.employee'))
ALTER TABLE dbo.employee CHECK CONSTRAINT FK_employee_employee
GO
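As a runnable illustration of the recursive-CTE delete, here is a small sketch using the sample rows from the question. It runs on SQLite via Python's sqlite3, not SQL Server, so treat it as a demonstration of the idea only; the function name delete_with_subordinates is made up, and UNION is used instead of UNION ALL so that John's self-reference cannot make the recursion loop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
    CREATE TABLE employee (
        id      INTEGER PRIMARY KEY,
        name    TEXT,
        boss_id INTEGER REFERENCES employee(id)   -- no cascade on the self-reference
    );
    CREATE TABLE information (
        emp_id INTEGER REFERENCES employee(id) ON DELETE CASCADE,
        street TEXT,
        phone  TEXT
    );
""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "John", 1), (2, "Hillary", 1), (3, "Hamilton", 1), (4, "Scott", 2),
    (5, "Susan", 2), (6, "Seth", 2), (7, "Rick", 5), (8, "Rachael", 5),
])
conn.executemany("INSERT INTO information VALUES (?, 'blah blah', 'blah')",
                 [(2,), (6,), (7,)])

def delete_with_subordinates(conn, emp_id):
    """Delete one employee plus everyone below them in a single statement."""
    conn.execute("""
        WITH RECURSIVE doomed(id) AS (
            SELECT id FROM employee WHERE id = ?
            UNION
            SELECT e.id FROM employee AS e JOIN doomed ON e.boss_id = doomed.id
        )
        DELETE FROM employee WHERE id IN (SELECT id FROM doomed)
    """, (emp_id,))
    conn.commit()

delete_with_subordinates(conn, 2)  # Hillary and her whole department
remaining = [r[0] for r in conn.execute("SELECT name FROM employee ORDER BY id")]
info_rows = conn.execute("SELECT COUNT(*) FROM information").fetchone()[0]
```

Because the whole subtree goes in one statement, the self-referencing key is satisfied once the statement finishes, and the ordinary cascade on information removes the detail rows.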
The below might work for you (I haven't tested it so it may require some tweaking). Seems like all you have to do is delete the employees from the bottom of the hierarchy before you delete the ones higher-up. Use a CTE to build the delete hierarchy recursively and order the CTE output descending by the hierarchy level of the employee. Then delete in order.
CREATE PROC usp_DeleteEmployeeAndSubordinates (@empId INT)
AS
-- collect the employee and all subordinates, tracking depth in the hierarchy
;WITH employeesToDelete AS (
SELECT id, CAST(1 AS INT) AS empLevel
FROM employee
WHERE id = @empId
UNION ALL
SELECT e.id, etd.empLevel + 1
FROM employee e
JOIN employeesToDelete etd ON e.boss_id = etd.id AND e.boss_id != e.id
)
-- number the rows deepest-first so children are deleted before their bosses
SELECT id, ROW_NUMBER() OVER (ORDER BY empLevel DESC) Ord
INTO #employeesToDelete
FROM employeesToDelete;
-- read @@ROWCOUNT first, before another statement can reset it
DECLARE @max INT = @@ROWCOUNT, @current INT = 1;
WHILE @current <= @max
BEGIN
DELETE employee WHERE id = (SELECT id FROM #employeesToDelete WHERE Ord = @current);
SET @current = @current + 1;
END;
GO
This may sound extreme, but I don't think there is a simple baked in option for what you are looking to do. I would suggest creating a proc that would do the following:
Disable FK constraints
get a list of employees to be deleted using a recursive CTE (save this in a temp table)
Delete the rows from the parent / child table
Delete rows from the employee information table
Enable FK Constraints
Wrap the whole thing in a transaction to maintain consistency

How to perform a mass SQL insert to one table with rows from two separate tables

I need some T-SQL help. We have an application which tracks Training Requirements assigned to each employee (such as CPR, First Aid, etc.). There are certain minimum Training Requirements which all employees must be assigned and my HR department wants me to give them the ability to assign those minimum Training Requirements to all personnel with the click of a button. So I have created a table called TrainingRequirementsForAllEmployees which has the TrainingRequirementID's of those identified minimum TrainingRequirements.
I want to insert rows into table Employee_X_TrainingRequirements for every employee in the Employees table joined with every row from TrainingRequirementsForAllEmployees.
I will add abbreviated table schema for clarity.
First table is Employees:
EmployeeNumber PK char(6)
EmployeeName varchar(50)
Second Table is TrainingRequirementsForAllEmployees:
TrainingRequirementID PK int
Third table (the one I need to Insert Into) is Employee_X_TrainingRequirements:
TrainingRequirementID PK int
EmployeeNumber PK char(6)
I don't know what the Stored Procedure should look like to achieve the results I need. Thanks for any help.
The CROSS JOIN operator is suitable when the Cartesian product of two sets of data is needed. So in the body of your stored procedure you should have something like:
insert into Employee_X_TrainingRequirements (TrainingRequirementID, EmployeeNumber)
select r.TrainingRequirementID, e.EmployeeNumber
from Employees e
cross join TrainingRequirementsForAllEmployees r
where not exists (
select 1 from Employee_X_TrainingRequirements
where TrainingRequirementID = r.TrainingRequirementID
and EmployeeNumber = e.EmployeeNumber
)
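To show that the NOT EXISTS guard makes the button safe to click more than once, here is a small sketch on SQLite via Python's sqlite3 (chosen only so it runs without a server). Table and column names follow the question; the function name assign_minimum_training and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeNumber TEXT PRIMARY KEY, EmployeeName TEXT);
    CREATE TABLE TrainingRequirementsForAllEmployees (TrainingRequirementID INTEGER PRIMARY KEY);
    CREATE TABLE Employee_X_TrainingRequirements (
        TrainingRequirementID INTEGER,
        EmployeeNumber        TEXT,
        PRIMARY KEY (TrainingRequirementID, EmployeeNumber)
    );
    INSERT INTO Employees VALUES ('000001', 'Ann'), ('000002', 'Bob');
    INSERT INTO TrainingRequirementsForAllEmployees VALUES (10), (20);
    -- Ann already has requirement 10 assigned.
    INSERT INTO Employee_X_TrainingRequirements VALUES (10, '000001');
""")

def assign_minimum_training(conn):
    """Insert every missing employee/requirement pair; return how many were added."""
    cur = conn.execute("""
        INSERT INTO Employee_X_TrainingRequirements (TrainingRequirementID, EmployeeNumber)
        SELECT r.TrainingRequirementID, e.EmployeeNumber
        FROM Employees AS e
        CROSS JOIN TrainingRequirementsForAllEmployees AS r
        WHERE NOT EXISTS (
            SELECT 1 FROM Employee_X_TrainingRequirements AS x
            WHERE x.TrainingRequirementID = r.TrainingRequirementID
              AND x.EmployeeNumber = e.EmployeeNumber
        )
    """)
    return cur.rowcount

first_run = assign_minimum_training(conn)   # adds the 3 pairs that were missing
second_run = assign_minimum_training(conn)  # nothing left to add
total = conn.execute("SELECT COUNT(*) FROM Employee_X_TrainingRequirements").fetchone()[0]
```

Without the guard, the second run would hit the primary key and fail on the pairs that already exist.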

How to improve user defined function with while on SQL Server?

I have a SQL Server 2008 R2 UDF which performs a kind of recursive loop. I mean, I have a table called Employees where in one of my columns I store another Employee id (his boss).
When I get an employee id, I must be able to know the whole department below him. For example:
Employee Joe (ID:1) works for Robert (ID:2)
Employee Robert (ID:2) works for Michelle (ID:3)
I must be able to count the salary (let's suppose it's on the same table) of all employees below Michelle, i.e. Robert and Joe.
Up to now, I created a UDF that returns a table with all employee ids below Michelle and use an EXISTS clause on the queries' where but it performs very poorly.
Do you guys have another idea?
Thank you!
You should probably use a recursive CTE rather than a WHILE loop to find all of the employees. I don't have your tables or data so I've made some up:
create table Employees (
ID int not null primary key,
Name varchar(20) not null,
BigBossID int null foreign key references Employees(ID),
Salary decimal(18,4) not null
)
go
insert into Employees (ID,Name,BigBossID,Salary) values
(1,'Joe',2,2.50),
(2,'Robert',3,19000.75),
(3,'Michelle',null,1234567890.00)
And then I can use this query to find all employees below Michelle:
declare @RootID int
set @RootID = 3
;With EmployeesBelowRoot as (
select ID from Employees where BigBossID = @RootID
union all
select e.ID from Employees e inner join EmployeesBelowRoot ebr on e.BigBossID = ebr.ID
)
select SUM(Salary) from Employees where ID in (select ID from EmployeesBelowRoot)
You could (if you think it's worth it) place the CTE (EmployeesBelowRoot) into a UDF and call it with @RootID as a parameter, but I've just put it directly in the query for now.
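For a version that can be run without SQL Server, the same recursive CTE is sketched below on SQLite through Python's sqlite3, wrapped in a function so the root id becomes a parameter. The function name salary_below is made up, and Salary is stored as REAL here because SQLite has no decimal type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        ID        INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        BigBossID INTEGER REFERENCES Employees(ID),
        Salary    REAL NOT NULL
    );
    INSERT INTO Employees (ID, Name, BigBossID, Salary) VALUES
        (3, 'Michelle', NULL, 1234567890.00),
        (2, 'Robert', 3, 19000.75),
        (1, 'Joe', 2, 2.50);
""")

def salary_below(conn, root_id):
    """Total salary of everyone who reports to root_id, directly or indirectly."""
    row = conn.execute("""
        WITH RECURSIVE EmployeesBelowRoot(ID) AS (
            SELECT ID FROM Employees WHERE BigBossID = ?
            UNION ALL
            SELECT e.ID FROM Employees AS e
            JOIN EmployeesBelowRoot AS ebr ON e.BigBossID = ebr.ID
        )
        SELECT COALESCE(SUM(e.Salary), 0)
        FROM Employees AS e
        JOIN EmployeesBelowRoot AS ebr ON ebr.ID = e.ID
    """, (root_id,)).fetchone()
    return row[0]

below_michelle = salary_below(conn, 3)  # Robert + Joe
below_robert = salary_below(conn, 2)    # Joe only
below_joe = salary_below(conn, 1)       # nobody reports to Joe
```

The root's own salary is excluded, matching the query above, and COALESCE turns the empty case into 0 instead of NULL.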

TSQL foreign keys on views?

I have a SQL-Server 2008 database and a schema which uses foreign key constraints to enforce referential integrity. Works as intended. Now the user creates views on the original tables to work on subsets of the data only. My problem is that filtering certain datasets in some tables but not in others will violate the foreign key constraints.
Imagine two tables "one" and "two". "one" contains just an id column with values 1,2,3. "Two" references "one". Now you create views on both tables. The view for table "two" doesn't filter anything while the view for table "one" removes all rows but the first. You'll end up with entries in the second view that point nowhere.
Is there any way to avoid this? Can you have foreign key constraints between views?
Some Clarification in response to some of the comments:
I'm aware that the underlying constraints will ensure integrity of the data even when inserting through the views. My problem lies with the statements consuming the views. Those statements have been written with the original tables in mind and assume certain joins cannot fail. This assumption is always valid when working with the tables - but views potentially break it.
Joining/checking all constraints when creating the views in the first place is annoying because of the large number of referencing tables. Thus I was hoping to avoid that.
I love your question. It screams of familiarity with the Query Optimizer, and how it can see that some joins are redundant if they serve no purpose, or if it can simplify something knowing that there is at most one hit on the other side of a join.
So, the big question is around whether you can make a FK against the CIX of an Indexed View. And the answer is no.
create table dbo.testtable (id int identity(1,1) primary key, val int not null);
go
create view dbo.testview with schemabinding as
select id, val
from dbo.testtable
where val >= 50
;
go
insert dbo.testtable
select 20 union all
select 30 union all
select 40 union all
select 50 union all
select 60 union all
select 70
go
create unique clustered index ixV on dbo.testview(id);
go
create table dbo.secondtable (id int references dbo.testview(id));
go
All this works except for the last statement, which errors with:
Msg 1768, Level 16, State 0, Line 1
Foreign key 'FK__secondtable__id__6A325CF7' references object 'dbo.testview' which is not a user table.
So the Foreign key must reference a user table.
But... the next question is about whether you could reference a unique index that is filtered in SQL 2008, to achieve a view-like FK.
And still the answer is no.
create unique index ixUV on dbo.testtable(val) where val >= 50;
go
This succeeded.
But now if I try to create a table that references the val column
create table dbo.thirdtable (id int identity(1,1) primary key, val int not null check (val >= 50) references dbo.testtable(val));
(I was hoping that the check constraint that matched the filter in the filtered index might help the system understand that the FK should hold)
But I get an error saying:
There are no primary or candidate keys in the referenced table 'dbo.testtable' that matching the referencing column list in the foreign key 'FK__thirdtable__val__0EA330E9'.
If I drop the filtered index and create a non-filtered unique non-clustered index, then I can create dbo.thirdtable without any problems.
So I'm afraid the answer still seems to be No.
It took me some time to figure out the misunderstanding here -- not sure if I still understand completely, but here it is.
I will use an example, close to yours, but with some data -- easier for me to think in these terms.
So first two tables; A = Department B = Employee
CREATE TABLE Department
(
DepartmentID int PRIMARY KEY
,DepartmentName varchar(20)
,DepartmentColor varchar(10)
)
GO
CREATE TABLE Employee
(
EmployeeID int PRIMARY KEY
,EmployeeName varchar(20)
,DepartmentID int FOREIGN KEY REFERENCES Department ( DepartmentID )
)
GO
Now I'll toss some data in
INSERT INTO Department
( DepartmentID, DepartmentName, DepartmentColor )
SELECT 1, 'Accounting', 'RED' UNION
SELECT 2, 'Engineering', 'BLUE' UNION
SELECT 3, 'Sales', 'YELLOW' UNION
SELECT 4, 'Marketing', 'GREEN' ;
INSERT INTO Employee
( EmployeeID, EmployeeName, DepartmentID )
SELECT 1, 'Lyne', 1 UNION
SELECT 2, 'Damir', 2 UNION
SELECT 3, 'Sandy', 2 UNION
SELECT 4, 'Steve', 3 UNION
SELECT 5, 'Brian', 3 UNION
SELECT 6, 'Susan', 3 UNION
SELECT 7, 'Joe', 4 ;
So, now I'll create a view on the first table to filter some departments out.
CREATE VIEW dbo.BlueDepartments
AS
SELECT * FROM dbo.Department
WHERE DepartmentColor = 'BLUE'
GO
This returns
DepartmentID DepartmentName DepartmentColor
------------ -------------------- ---------------
2 Engineering BLUE
And per your example, I'll add a view for the second table which does not filter anything.
CREATE VIEW dbo.AllEmployees
AS
SELECT * FROM dbo.Employee
GO
This returns
EmployeeID EmployeeName DepartmentID
----------- -------------------- ------------
1 Lyne 1
2 Damir 2
3 Sandy 2
4 Steve 3
5 Brian 3
6 Susan 3
7 Joe 4
It seems to me that you think that Employee No 5, DepartmentID = 3 points to nowhere?
"You'll end up with entries in the
second view that point nowhere."
Well, it points to the Department table DepartmentID = 3, as specified with the foreign key. Even if you try to join view on view nothing is broken:
SELECT e.EmployeeID
,e.EmployeeName
,d.DepartmentID
,d.DepartmentName
,d.DepartmentColor
FROM dbo.AllEmployees AS e
JOIN dbo.BlueDepartments AS d ON d.DepartmentID = e.DepartmentID
ORDER BY e.EmployeeID
Returns
EmployeeID EmployeeName DepartmentID DepartmentName DepartmentColor
----------- -------------------- ------------ -------------------- ---------------
2 Damir 2 Engineering BLUE
3 Sandy 2 Engineering BLUE
So nothing is broken here; the join simply did not find matching records for DepartmentID <> 2. This is actually the same as if I join the tables and then include the filter from the first view:
SELECT e.EmployeeID
,e.EmployeeName
,d.DepartmentID
,d.DepartmentName
,d.DepartmentColor
FROM dbo.Employee AS e
JOIN dbo.Department AS d ON d.DepartmentID = e.DepartmentID
WHERE d.DepartmentColor = 'BLUE'
ORDER BY e.EmployeeID
Returns again:
EmployeeID EmployeeName DepartmentID DepartmentName DepartmentColor
----------- -------------------- ------------ -------------------- ---------------
2 Damir 2 Engineering BLUE
3 Sandy 2 Engineering BLUE
In both cases joins do not fail, they simply do as expected.
Now I will try to break the referential integrity through a view (there is no DepartmentID= 127)
INSERT INTO dbo.AllEmployees
( EmployeeID, EmployeeName, DepartmentID )
VALUES( 10, 'Bob', 127 )
And this results in:
Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the FOREIGN KEY constraint "FK__Employee__Depart__0519C6AF". The conflict occurred in database "Tinker_2", table "dbo.Department", column 'DepartmentID'.
If I try to delete a department through the view
DELETE FROM dbo.BlueDepartments
WHERE DepartmentID = 2
Which results in:
Msg 547, Level 16, State 0, Line 1
The DELETE statement conflicted with the REFERENCE constraint "FK__Employee__Depart__0519C6AF". The conflict occurred in database "Tinker_2", table "dbo.Employee", column 'DepartmentID'.
So constraints on underlying tables still apply.
Hope this helps, but then maybe I misunderstood your problem.
Peter already hit on this, but the best solution is to:
1. Create the "main" logic (the filtering of the referenced table) once.
2. Have all views on related tables join to the view created in step 1, not the original table.
I.e.,
CREATE VIEW v1 AS SELECT * FROM table1 WHERE blah
CREATE VIEW v2 AS SELECT * FROM table2 WHERE EXISTS
(SELECT NULL FROM v1 WHERE v1.id = table2.FKtoTable1)
Sure, syntactic sugar for propagating filters for views on one table to views on subordinate tables would be handy, but alas, it's not part of the SQL standard. That said, this solution is still good enough -- efficient, straightforward, maintainable, and guarantees the desired state for the consuming code.
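As a concrete sketch of that pattern, here it is applied to the Department/Employee sample data used elsewhere in this thread. It runs on SQLite via Python's sqlite3 for convenience; the view name BlueEmployees is an assumption, and the final query simply counts rows in the dependent view that would fail a join to the filtered view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (
        DepartmentID    INTEGER PRIMARY KEY,
        DepartmentName  TEXT,
        DepartmentColor TEXT
    );
    CREATE TABLE Employee (
        EmployeeID   INTEGER PRIMARY KEY,
        EmployeeName TEXT,
        DepartmentID INTEGER REFERENCES Department(DepartmentID)
    );
    INSERT INTO Department VALUES
        (1, 'Accounting', 'RED'), (2, 'Engineering', 'BLUE'),
        (3, 'Sales', 'YELLOW'),   (4, 'Marketing', 'GREEN');
    INSERT INTO Employee VALUES
        (1, 'Lyne', 1), (2, 'Damir', 2), (3, 'Sandy', 2), (4, 'Steve', 3),
        (5, 'Brian', 3), (6, 'Susan', 3), (7, 'Joe', 4);

    -- The filter lives in exactly one place.
    CREATE VIEW BlueDepartments AS
        SELECT * FROM Department WHERE DepartmentColor = 'BLUE';

    -- The dependent view reuses that filter instead of repeating it.
    CREATE VIEW BlueEmployees AS
        SELECT * FROM Employee
        WHERE EXISTS (SELECT 1 FROM BlueDepartments AS d
                      WHERE d.DepartmentID = Employee.DepartmentID);
""")

names = [r[0] for r in conn.execute(
    "SELECT EmployeeName FROM BlueEmployees ORDER BY EmployeeID")]

# Rows in the employee view whose department is missing from the department view.
orphans = conn.execute("""
    SELECT COUNT(*) FROM BlueEmployees AS e
    LEFT JOIN BlueDepartments AS d ON d.DepartmentID = e.DepartmentID
    WHERE d.DepartmentID IS NULL
""").fetchone()[0]
```

If the filter in BlueDepartments changes later, BlueEmployees follows automatically, so consuming code can keep assuming the join never comes up empty.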
If you try to insert, update or delete data through a view, the underlying table constraints still apply.
Something like this in View2 is probably your best bet:
CREATE VIEW View2
AS
SELECT
T2.col1,
T2.col2,
...
FROM
Table2 T2
INNER JOIN Table1 T1 ON
T1.pk = T2.t1_fk
If rolling over tables so that Identity columns will not clash, one possibility would be to use a lookup table that referenced the different data tables by Identity and a table reference.
Foreign keys on this table would work down the line for referencing tables.
This would be expensive in a number of ways
Referential integrity on the lookup table would have to be enforced using triggers.
Additional storage of the lookup table and indexing in addition to the data tables.
Data reading would almost certainly involve a Stored Procedure or three to execute a filtered UNION.
Query plan evaluation would also have a development cost.
The list goes on but it might work on some scenarios.
Using Rob Farley's schema:
CREATE TABLE dbo.testtable(
id int IDENTITY(1,1) PRIMARY KEY,
val int NOT NULL);
go
INSERT dbo.testtable(val)
VALUES(20),(30),(40),(50),(60),(70);
go
CREATE TABLE dbo.secondtable(
id int NOT NULL,
CONSTRAINT FK_SecondTable FOREIGN KEY(id) REFERENCES dbo.TestTable(id));
go
CREATE TABLE z(n tinyint PRIMARY KEY);
INSERT z(n)
VALUES(0),(1);
go
CREATE VIEW dbo.SecondTableCheck WITH SCHEMABINDING AS
SELECT 1 n
FROM dbo.TestTable AS t JOIN dbo.SecondTable AS s ON t.Id = s.Id
CROSS JOIN dbo.z
WHERE t.Val < 50;
go
CREATE UNIQUE CLUSTERED INDEX NoSmallIds ON dbo.SecondTableCheck(n);
go
I had to create a tiny helper table (dbo.z) in order to make this work, because indexed views cannot have self joins, outer joins, subqueries, or derived tables (and TVCs count as derived tables).
Another approach, depending on your requirements, would be to use a stored procedure to return two recordsets. You pass it filtering criteria and it uses the filtering criteria to query table 1, and then those results can be used to filter the query to table 2 so that its results are also consistent. Then you return both results.
You could stage the filtered table 1 data to another table. The contents of this staging table are your view 1, and then you build view 2 via a join of the staging table and table 2. This way the processing for filtering table 1 is done once and reused for both views.
Really what it boils down to is that view 2 has no idea what kind of filtering you performed in view 1, unless you tell view 2 the filtering criteria, or make it somehow dependent on the results of view 1, which means emulating the same filtering that occurs on view1.
Constraints don't perform any kind of filtering, they only prevent invalid data, or cascade key changes and deletes.
No, you can't create foreign keys on views.
Even if you could, where would that leave you? You would still have to declare the FK after creating the view. Who would declare the FK, you or the user? If the user is sophisticated enough to declare a FK, why couldn't he add an inner join to the referenced view? eg:
create view view1 as select a, b, c, d from table1 where a in (1, 2, 3)
go
create view view2 as select a, m, n, o from table2 where a in (select a from view1)
go
vs:
create view view1 as select a, b, c, d from table1 where a in (1, 2, 3)
go
create view view2 as select a, m, n, o from table2
go
-- pseudo-syntax for fk (not valid T-SQL):
alter view2 add foreign key (a) references view1 (a)
go
I don't see how the foreign key would simplify your job.
Alternatively:
Copy the subset of data into another schema or database. Same tables, same keys, less data, faster analysis, less contention.
If you need a subset of all the tables, use another database. If you only need a subset of some tables, use a schema in the same database. That way your new tables can still reference the non-copied tables.
Then use the existing views to copy the data over. Any FK violations will raise an error and identify which views require editing. Create a job and schedule it daily, if necessary.
From a purely data integrity perspective (and nothing to do with the Query Optimizer), I had considered an Indexed View. I figured you could make a unique index on it, which could be broken when you try to have broken integrity in your underlying tables.
But... I don't think you can get around the restrictions of indexed views well enough.
For example:
You can't use outer joins, or sub-queries. That makes it very hard to find the rows that don't exist in the view. If you use aggregates, you can't use HAVING, so that cuts out some options you could use there too. You can't even have constants in an indexed view if you have grouping (whether or not you use a GROUP BY clause), so you can't even try putting an index on a constant field so that a second row will fall over. You can't use UNION ALL, so the idea of having a count which will break a unique index when it hits a second zero won't work.
I feel like there should be an answer, but I'm afraid you're going to have to take a good look at your actual design and work out what you really need. Perhaps triggers (and good indexes) on the tables involved, so that any changes that might break something can roll it all back.
But I was really hoping to be able to suggest something that the Query Optimizer might be able to leverage to help the performance of your system, but I don't think I can.