Check for uniqueness within update statement - sql

I have a table that links a class to the students that are in that class:
create table class_student
(class_id int,
student_id int,
constraint class_student_u unique nonclustered (class_id, student_id))
If I want to transfer all the classes from one student to another (remove one student from all the classes he/she is enrolled in and add another student to each of the classes the old student was enrolled in), I use the following query:
update class_student
set student_id = @newStudent
where student_id = @oldStudent
and class_id not in (select class_id
from class_student
where student_id = @newStudent)
delete from class_student
where student_id = @oldStudent
How can I transfer the classes from more than one student to the new student? I can't just put where student_id in (@oldStudent1, @oldStudent2), because if both old students are in the same class, the update above would violate the unique constraint. Also, I'd like to do the update in as few queries as possible (I could just run the above queries twice, but I'd like to do it in fewer).
I'm using SQL Server 2008 R2.
Edit: To clarify, here's an example:
class_id student_id
===================
1 1
1 2
2 3
3 1
3 3
4 2
4 3
This means that student 1 is in class 1 and 3, student 2 is in class 1 and 4, and student 3 is in class 2, 3, and 4. If I want to transfer all the classes from student 1 to student 3, I would run the following query:
update class_student
set student_id = 3
where student_id = 1
and class_id not in (select class_id
from class_student
where student_id = 3)
delete from class_student
where student_id = 1
Our data would look like this:
class_id student_id
===================
1 3
1 2
2 3
3 3
4 2
4 3
If, instead, I had run this query:
update class_student
set student_id = 3
where student_id in (1, 2)
and class_id not in (select class_id
from class_student
where student_id = 3)
delete from class_student
where student_id in (1, 2)
Ignoring the unique constraint on the table, the data would look like this:
class_id student_id
===================
1 3
1 3
2 3
3 3
4 3
The double (1, 3) record is what I'm trying to avoid, because it will cause a unique constraint violation in the table.

When setting up the original table you should always include a unique row id with which to reference any specific row (please see below the 'identity' column called row_id):
DROP TABLE class_student
create table class_student
(
row_id int identity(1,1),
class_id int,
student_id int,
constraint class_student_u unique nonclustered (class_id, student_id)
)
insert class_student (class_id,student_id) values (1,1)
insert class_student (class_id,student_id) values (1,2)
insert class_student (class_id,student_id) values (2,3)
insert class_student (class_id,student_id) values (3,1)
insert class_student (class_id,student_id) values (3,3)
insert class_student (class_id,student_id) values (4,2)
insert class_student (class_id,student_id) values (4,3)
In a situation where students 1 and 2 are leaving and you are passing any classes they were taking to student 3 (unless student 3 is already attending those classes), the code could look something like this:
WITH CTE
AS
(
-- ordering by row_id makes the numbering within each class deterministic
SELECT row_id, class_id, student_id,
RN = ROW_NUMBER() OVER (PARTITION BY class_id ORDER BY row_id)
FROM class_student
WHERE student_id in (1,2,3)
)
DELETE FROM class_student
where class_id in (select class_id from class_student
group by class_id having count(class_id) > 1)
and student_id <> 3
and row_id not in (select row_id from CTE where student_id <> 3 and RN >= 2)
-- only the leaving students' rows are reassigned
Update class_student set student_id = 3 where student_id in (1,2)
I am using a 'common table expression' with ROW_NUMBER to number the rows within each class_id. To see this, you can run the code below after creating the class_student table and inserting the data (see top) but before you run the CTE code above:
WITH CTE
AS
(
SELECT row_id, class_id, student_id,
RN = ROW_NUMBER() OVER (PARTITION BY class_id ORDER BY row_id)
FROM class_student
WHERE student_id in (1,2,3)
)
SELECT * FROM CTE
Classes 1, 3 and 4 each appear in two rows, so the second row of each has a value of 2 in the RN (row number) column.
I use this result from the CTE to delete the rows we don't need from the class_student table, and this is where the importance of always having a unique row_id can be seen.
The DELETE removes rows from classes that have more than one row. Where a class is attended by student 3 and by one or both of the other students, it removes the rows whose student_id is not 3 (because student 3 is not leaving).
To avoid also removing rows that must survive to be reassigned to student 3, it compares row_ids and retains the rows where RN >= 2 and student_id is not 3. That keeps one row for each class that students 1 and 2 both took but student 3 did not.
Finally, the UPDATE sets student_id to 3 on the remaining rows so that student 3 gets all the classes.
To see the result you can run:
select * from class_student
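The row-numbering idea can be tried outside SQL Server. The sketch below is a simplified variant rather than the exact DELETE above: it runs on SQLite through Python's sqlite3 module (an assumption made only so it is self-contained; window functions need SQLite 3.25 or later), orders student 3 first within each class, and then treats every row numbered 2 or higher as surplus.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE class_student (
    row_id INTEGER PRIMARY KEY,          -- stands in for the identity column
    class_id INT,
    student_id INT,
    UNIQUE (class_id, student_id))""")
conn.executemany(
    "INSERT INTO class_student (class_id, student_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 3), (3, 1), (3, 3), (4, 2), (4, 3)])

with conn:  # both statements commit together
    # Number the rows of students 1, 2 and 3 inside each class, student 3 first.
    # Any row numbered 2 or higher would collide after the update, so delete it.
    conn.execute("""
        DELETE FROM class_student WHERE row_id IN (
            SELECT row_id FROM (
                SELECT row_id,
                       ROW_NUMBER() OVER (
                           PARTITION BY class_id
                           ORDER BY CASE WHEN student_id = 3 THEN 0 ELSE 1 END,
                                    row_id) AS rn
                FROM class_student
                WHERE student_id IN (1, 2, 3))
            WHERE rn > 1)""")
    # Reassign what is left of the leaving students to student 3.
    conn.execute(
        "UPDATE class_student SET student_id = 3 WHERE student_id IN (1, 2)")

remaining = conn.execute(
    "SELECT class_id, student_id FROM class_student ORDER BY class_id").fetchall()
print(remaining)
```

Putting student 3 first in the ordering means the row that survives in a shared class is always student 3's own, so the final update never touches a class that student 3 already attends.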

I think you'll need at least 2 DML statements to accomplish your goal. And if you really need it to happen in one go, then you can wrap the statements in a stored procedure.
insert into class_student (class_id, student_id)
select distinct class_id, @newStudent
from class_student
where student_id in (@oldStudent1, @oldStudent2)
and class_id not in (select class_id
from class_student
where student_id = @newStudent);
delete from class_student
where student_id in (@oldStudent1, @oldStudent2);
EDIT: Fixed insert to include the "not in" clause.
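As a rough check of the insert-then-delete approach against the sample data from the question, here is a minimal sketch using SQLite through Python's sqlite3 module. SQLite stands in for SQL Server 2008 R2 here, and the helper name transfer_classes is hypothetical.

```python
import sqlite3

def transfer_classes(conn, old_students, new_student):
    """Give new_student every class of old_students without creating duplicates."""
    marks = ",".join("?" * len(old_students))  # one placeholder per old student
    with conn:  # single transaction: both statements commit or roll back together
        # DISTINCT collapses a class shared by several old students into one row;
        # NOT IN skips classes the new student already attends.
        conn.execute(
            f"""INSERT INTO class_student (class_id, student_id)
                SELECT DISTINCT class_id, ?
                FROM class_student
                WHERE student_id IN ({marks})
                  AND class_id NOT IN (SELECT class_id FROM class_student
                                       WHERE student_id = ?)""",
            [new_student, *old_students, new_student])
        conn.execute(
            f"DELETE FROM class_student WHERE student_id IN ({marks})",
            list(old_students))

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE class_student (
    class_id INT, student_id INT, UNIQUE (class_id, student_id))""")
conn.executemany(
    "INSERT INTO class_student VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 3), (3, 1), (3, 3), (4, 2), (4, 3)])

transfer_classes(conn, [1, 2], 3)
rows = conn.execute(
    "SELECT class_id, student_id FROM class_student ORDER BY class_id").fetchall()
print(rows)
```

Class 1 is shared by students 1 and 2, yet only one (1, 3) row is inserted, so the unique constraint is never violated.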

Related

Get every unique pair combination of a column in SQL

Let's say I have the given table:
1 A
2 A
3 A
How do I JOIN / combine the table with itself so I get every possible unique pair combination of the first column:
1 1 A
1 2 A
1 3 A
2 1 A
2 2 A
2 3 A
...
You can do something like this.
A CROSS JOIN produces the cross product of the table with itself:
-- create
CREATE TABLE EMPLOYEE (
empId INTEGER PRIMARY KEY,
name TEXT NOT NULL
);
-- insert
INSERT INTO EMPLOYEE VALUES (0001, 'Clark');
INSERT INTO EMPLOYEE VALUES (0002, 'Dave');
INSERT INTO EMPLOYEE VALUES (0003, 'Ava');
-- fetch
SELECT e1.empId, e2.empId, e1.name FROM EMPLOYEE e1
CROSS JOIN EMPLOYEE e2;
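To see how many rows the cross join yields on the three-row table from the question, here is a small sketch on SQLite through Python's sqlite3 module (SQLite is an assumption; the question does not name a database). It also shows a variant that returns each unordered pair only once, in case that is what "unique" was meant to say.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empId INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, "A"), (2, "A"), (3, "A")])

# Every ordered pair, self-pairs included: 3 x 3 = 9 rows.
all_pairs = conn.execute("""
    SELECT e1.empId, e2.empId, e1.name
    FROM employee e1
    CROSS JOIN employee e2
    ORDER BY e1.empId, e2.empId""").fetchall()

# Each unordered pair once and no self-pairs: 3 rows.
unordered_pairs = conn.execute("""
    SELECT e1.empId, e2.empId
    FROM employee e1
    JOIN employee e2 ON e1.empId < e2.empId
    ORDER BY e1.empId, e2.empId""").fetchall()

print(all_pairs)
print(unordered_pairs)
```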

SQL - EXIST OR ALL?

I have two tables, student and grades;
the grades table has an attribute student_id which references student_id in the student table.
How do I find the students who have every grade that exists?
If this is not clear, here is an example:
Student ID Name
1 1 John
2 2 Paul
3 3 George
4 4 Mike
5 5 Lisa
Grade Student_Id Course Grade
1 1 Math A
2 1 English B
3 1 Physics C
4 2 Math A
5 2 English A
6 2 Physics B
7 3 Economics A
8 4 Art C
9 5 Biology A
Assume there are only grades A, B and C (no D, E or fail).
I want to find only John, because he has grades A, B and C, while
other students like Paul (2) should not be selected because he does not have grade C. It does not matter which courses a student took; I just need to find out whether he has every grade that is available.
I feel like I should use something like the EXISTS or ALL function in SQL, but I'm not sure.
Please help. Thank you in advance.
I would use GROUP BY and HAVING, but like this:
SELECT s.Name
FROM Student s JOIN
Grade g
ON s.ID = g.Student_Id
GROUP BY s.id, s.Name
HAVING COUNT(DISTINCT g.Grade) = (SELECT COUNT(DISTINCT g2.grade) FROM grade g2);
You say "all the grades out there", so the query should not use a constant for that.
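The difference between a computed count and a constant shows up as soon as a new grade appears. A minimal sketch with the question's data, run on SQLite through Python's sqlite3 module (an assumption; the helper name students_with_all_grades is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Grade (Student_Id INT, Course TEXT, Grade TEXT);
INSERT INTO Student VALUES (1,'John'),(2,'Paul'),(3,'George'),(4,'Mike'),(5,'Lisa');
INSERT INTO Grade VALUES
  (1,'Math','A'),(1,'English','B'),(1,'Physics','C'),
  (2,'Math','A'),(2,'English','A'),(2,'Physics','B'),
  (3,'Economics','A'),(4,'Art','C'),(5,'Biology','A');
""")

QUERY = """
SELECT s.Name
FROM Student s
JOIN Grade g ON s.ID = g.Student_Id
GROUP BY s.ID, s.Name
HAVING COUNT(DISTINCT g.Grade) = (SELECT COUNT(DISTINCT g2.Grade) FROM Grade g2)
"""

def students_with_all_grades(conn):
    """Names of students whose distinct grades cover every grade in the table."""
    return [name for (name,) in conn.execute(QUERY)]

before = students_with_all_grades(conn)   # only John holds A, B and C

# Once a fourth grade exists, nobody holds all of them any more.
conn.execute("INSERT INTO Grade VALUES (5, 'Chemistry', 'D')")
after = students_with_all_grades(conn)

print(before, after)
```

A query with a hard-coded 3 would still return John after the D grade is added, which is no longer "every grade that exists".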
You can use HAVING COUNT(DISTINCT Grade) = 3 to check that the student has all 3 grades:
SELECT Name
FROM Student S
JOIN Grade G ON S.ID = G.Student_Id
GROUP BY Name
HAVING COUNT(DISTINCT Grade) = 3
Guessing at S.ID vs S.Student on the join. Not sure what the difference is there.
By using exists
select * from student s
where exists ( select 1
from grades g where g.Student_Id=s.ID
group by g.Student_Id
having count(distinct Grade)=3
)
Example
with Student as
(
select 1 as id,'John' as person
union all
select 2 as id,'Paul' as person
union all
select 3 as id,'jorge'
),
Grades as
(
select 1 as Graden, 1 as Student_Id, 'Math' as Course, 'A' as Grade
union all
select 2 as Graden, 1 as Student_Id, 'English' as Course, 'B' as Grade
union all
select 3 as Graden, 1 as Student_Id, 'Physics' as Course, 'C' as Grade
union all
select 4 as Graden, 2 as Student_Id, 'Math' as Course, 'A' as Grade
union all
select 5 as Graden, 2 as Student_Id, 'English' as Course, 'A' as Grade
union all
select 6 as Graden, 2 as Student_Id, 'Physics' as Course, 'B' as Grade
)
select * from Student s
where exists ( select 1
from Grades g where g.Student_Id=s.ID
group by g.Student_Id
having count(distinct Grade)=3
)
Note: I used having count(distinct Grade)=3 because your sample data contains three distinct grades.
Before delving into the answer, here's a working SQL Fiddle Example so you can see this in action.
As Gordon Linoff points out in his excellent answer, you should use GroupBy and Having Count(Distinct ... ) ... as an easy way to check.
However, I'd recommend changing your design to ensure that you have tables for each concern.
Currently your Grade table holds each student's grade per course. So it's more of a StudentCourse table (i.e. it's the combination of student and course that's unique / gives you that table's natural key). You should have an actual Grade table to give you the list of available grades; e.g.
create table Grade
(
Code char(1) not null constraint PK_Grade primary key clustered
)
insert Grade (Code) values ('A'),('B'),('C')
This ensures that your query still works if you decide to add grades D and E, without having to amend any code. It also means you only have to query a small table to get the complete list of grades, rather than a potentially huge one, which should perform better. Finally, it helps you maintain good data: because the validation lives in the database as a constraint, you can't accidentally end up with students with grade X due to a typo.
select Name from Student s
where s.Id in
(
select sc.StudentId
from StudentCourse sc
group by sc.StudentId
having count(distinct sc.Grade) = (select count(Code) from Grade)
)
order by s.Name
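The data-quality point can be checked with a foreign key from StudentCourse.Grade to the Grade lookup table. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, with a cut-down StudentCourse table (no Student or Course tables); note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE Grade (Code TEXT PRIMARY KEY);
INSERT INTO Grade (Code) VALUES ('A'), ('B'), ('C');
CREATE TABLE StudentCourse (
    StudentId INT NOT NULL,
    CourseId  INT NOT NULL,
    Grade     TEXT REFERENCES Grade(Code),   -- only listed grades are accepted
    UNIQUE (StudentId, CourseId));
""")

conn.execute("INSERT INTO StudentCourse VALUES (1, 1, 'A')")  # valid grade

try:
    conn.execute("INSERT INTO StudentCourse VALUES (1, 2, 'X')")  # typo
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the database refused the unknown grade

stored = conn.execute("SELECT COUNT(*) FROM StudentCourse").fetchone()[0]
print(rejected, stored)
```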
Likewise, it's sensible to create a Course table. In this case holding Ids for each course; since holding the full course name in your StudentCourse table (as we're now calling it) uses up a lot more space and again lacks validation / constraints. As such, I'd propose amending your database schema to look like this:
create table Grade
(
Code char(1) not null constraint PK_Grade primary key clustered
)
insert Grade (Code) values ('A'),('B'),('C')
create table Course
(
Id bigint not null identity(1,1) constraint PK_Course primary key clustered
, Name nvarchar(128) not null constraint UK_Course_Name unique
)
insert Course (Name) values ('Math'),('English'),('Physics'),('Economics'),('Art'),('Biology')
create table Student
(
Id bigint not null identity(1,1) constraint PK_Student primary key clustered
,Name nvarchar(128) not null constraint UK_Student_Name unique
)
set identity_insert Student on --inserting with IDs to ensure the ids of these students match data from your question
insert Student (Id, Name)
values (1, 'John')
, (2, 'Paul')
, (3, 'George')
, (4, 'Mike')
, (5, 'Lisa')
set identity_insert Student off
create table StudentCourse
(
Id bigint not null identity(1,1) constraint PK_StudentCourse primary key
, StudentId bigint not null constraint FK_StudentCourse_StudentId foreign key references Student(Id)
, CourseId bigint not null constraint FK_StudentCourse_CourseId foreign key references Course(Id)
, Grade char /* allow null in case we use this table for pre-results; otherwise make non-null */ constraint FK_StudentCourse_Grade foreign key references Grade(Code)
, Constraint UK_StudentCourse_StudentAndCourse unique clustered (StudentId, CourseId)
)
insert StudentCourse (StudentId, CourseId, Grade)
select s.Id, c.Id, x.Grade
from (values
('John', 'Math', 'A')
,('John', 'English', 'B')
,('John', 'Physics', 'C')
,('Paul', 'Math', 'A')
,('Paul', 'English', 'A')
,('Paul', 'Physics', 'B')
,('George', 'Economics','A')
,('Mike', 'Art', 'C')
,('Lisa', 'Biology', 'A')
) x(Student, Course, Grade)
inner join Student s on s.Name = x.Student
inner join Course c on c.Name = x.Course

Convert a one-to-many relationship to many-to-many and update existing references

I have a one-to-many relationship which I've converted to a many-to-many relationship.
Example:
Main Table (
Id int,
Code varchar(2)
)
Secondary Table (
Id int,
Name varchar(250),
MainId int
)
I have the following entries in the Main table:
Id Code
1 A
2 B
3 C
Secondary table:
Id Name MainId
1 Foo 1
2 Bar 1
3 Foo 2
4 Bar 2
5 Bar 3
Since the values in the 'Name' column of the 'Secondary' table are repeated quite often and the database size has grown considerably, I've decided to convert to a many-to-many relationship and reference only unique 'Name' entries.
As a first step I've created the following join table:
MainSecondary Table (
MainId int,
SecondaryId int,
)
For the final step I need to update the existing references and delete duplicate records based on the 'Name' column, which is where I'm stuck (over a million records).
The intended outcome should be:
Main table:
Id Code
1 A
2 B
3 C
Secondary table:
Id Name
1 Foo
2 Bar
MainSecondary table:
MainId SecondaryId
1 (A) 1 (Foo)
1 (A) 2 (Bar)
2 (B) 1 (Foo)
2 (B) 2 (Bar)
3 (C) 1 (Foo)
Set-up
create table main
(
id int,
code varchar(2)
);
create table secondary
(
id int,
name varchar(250),
main_id int
);
insert into main (id, code) values (1, 'A');
insert into main (id, code) values (2, 'B');
insert into main (id, code) values (3, 'C');
insert into secondary (id, name, main_id) values (1, 'Foo', 1);
insert into secondary (id, name, main_id) values (2, 'Bar', 1);
insert into secondary (id, name, main_id) values (3, 'Foo', 2);
insert into secondary (id, name, main_id) values (4, 'Bar', 2);
insert into secondary (id, name, main_id) values (5, 'Bar', 3);
Create new_secondary table
create table new_secondary
(
id int,
name varchar(250)
);
Create new relationship table: main_secondary
create table main_secondary
(
main_id int,
secondary_id int
);
Populate new_secondary table, removing duplicates
insert into new_secondary
(
id,
name
)
select
min(id),
name
from
secondary
group by
name;
Populate main_secondary relationship table
insert into main_secondary
(
main_id,
secondary_id
)
select distinct
a.main_id,
b.id as secondary_id
from
secondary a
join
new_secondary b
on a.name = b.name;
Check the results
select
a.id as main_id,
a.code,
c.id as secondary_id,
c.name
from
main a
join
main_secondary b
on a.id = b.main_id
join
new_secondary c
on c.id = b.secondary_id;
Results
main_id code secondary_id name
----------- ---- ------------ -------
1 A 1 Foo
2 B 1 Foo
1 A 2 Bar
2 B 2 Bar
3 C 2 Bar
(5 rows affected)
3 (C) 2 (Bar) is different from your example, but I think it's correct.
You would need to drop the old secondary table and rename the new_secondary table (when you are sure everything is OK) to keep things tidy.
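The two populate steps can be replayed end to end on the sample rows. This is a sketch on SQLite through Python's sqlite3 module (an assumption; the question does not say which database is in use), reusing the table names from the set-up above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE secondary (id INT, name TEXT, main_id INT);
INSERT INTO secondary VALUES
  (1,'Foo',1),(2,'Bar',1),(3,'Foo',2),(4,'Bar',2),(5,'Bar',3);
CREATE TABLE new_secondary (id INT, name TEXT);
CREATE TABLE main_secondary (main_id INT, secondary_id INT);

-- keep one row per name, using the lowest existing id as the surviving id
INSERT INTO new_secondary (id, name)
SELECT MIN(id), name FROM secondary GROUP BY name;

-- map every old (main_id, name) reference onto the surviving id
INSERT INTO main_secondary (main_id, secondary_id)
SELECT DISTINCT a.main_id, b.id
FROM secondary a
JOIN new_secondary b ON a.name = b.name;
""")

names = conn.execute("SELECT id, name FROM new_secondary ORDER BY id").fetchall()
links = conn.execute(
    "SELECT main_id, secondary_id FROM main_secondary "
    "ORDER BY main_id, secondary_id").fetchall()
print(names)
print(links)
```

Main row 3 links to id 2 (Bar), which matches the results table above.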

How to make a common id for more than one row

I have a table in my database like the one below:
Requestid (primary key, identity)   studentid   reqid
1                                   1           bc1
2                                   1           bc1
3                                   2           bc2
I want to generate the same reqid for student 1 if he makes more than one request.
I am using SQL Server 2005. Requestid is an identity column and studentid arrives when I submit my form, but I want reqid to be generated automatically. It should be the same for rows with the same student id, and change to a new value when the next student submits.
Please help me solve it. Thanks in advance.
If I understand what you want you could use a computed column to generate reqid.
create table StudentRequest
(
Requestid int identity primary key,
studentid int not null,
reqid as 'bc'+cast(studentid as varchar(10))
)
Test:
insert into StudentRequest (studentid) values (1)
insert into StudentRequest (studentid) values (1)
insert into StudentRequest (studentid) values (2)
select *
from StudentRequest
Result:
Requestid studentid reqid
----------- ----------- ------------
1 1 bc1
2 1 bc1
3 2 bc2

Add or delete repeated row

I have an output like this:
id name date school school1
1 john 11/11/2001 nyu ucla
1 john 11/11/2001 ucla nyu
2 paul 11/11/2011 uft mit
2 paul 11/11/2011 mit uft
I would like to achieve this:
id name date school school1
1 john 11/11/2001 nyu ucla
2 paul 11/11/2011 mit uft
I am using direct join as in:
select distinct
a.id, a.name,
b.date,
c.school,
a1.id, a1.name,
b1.date,
c1.school
from table a, table b, table c,table a1, table b1, table c1
where
a.id=b.id
and...
Any ideas?
We will need more information such as what your tables contain and what you are after.
One thing I noticed is that you have a school and then a school1. Normalization rules (the ban on repeating groups is usually attributed to first normal form rather than 3NF) say you should not duplicate fields and append numbers to them to hold more information, even if you think the relationship will only ever need 1 or 2 additional items. You need to create a second table that associates a user with one to many schools.
I agree with everyone else that both your source table and your desired output are poor design. While you probably can't do anything about your source table, I recommend the following code and output:
Select id, name, date, school from MyTable
union
Select id, name, date, school1 from MyTable;
(repeat as necessary)
This will give you results in the format:
id name date school
1 john 11/11/2001 nyu
1 john 11/11/2001 ucla
2 paul 11/11/2011 mit
2 paul 11/11/2011 uft
(Note: UNION removes duplicate rows by default, so the distinct flag isn't needed; UNION ALL would keep them.)
With this format, you could easily count the number of schools per student, number of students per school, etc.
If processing time and/or storage space is a factor here, you could then split this into 2 tables, 1 with the id,name & date, the other with the id & school (basically what JonH just said). But if you're just working up some simple statistics, this should suffice.
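A quick way to try the UNION reshaping on the four rows from the question is sketched below, using SQLite through Python's sqlite3 module (an assumption, since the question does not name a database). MyTable and its columns follow the query above; the per-student count is an added illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (id INT, name TEXT, date TEXT, school TEXT, school1 TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)", [
    (1, "john", "11/11/2001", "nyu", "ucla"),
    (1, "john", "11/11/2001", "ucla", "nyu"),
    (2, "paul", "11/11/2011", "uft", "mit"),
    (2, "paul", "11/11/2011", "mit", "uft"),
])

# UNION stacks both school columns into one and drops the duplicate rows.
UNPIVOT = """
    SELECT id, name, date, school FROM MyTable
    UNION
    SELECT id, name, date, school1 FROM MyTable
"""

rows = conn.execute(UNPIVOT + " ORDER BY id, school").fetchall()

# With one school per row, counting schools per student is a plain GROUP BY.
per_student = conn.execute(
    f"SELECT id, COUNT(*) FROM ({UNPIVOT}) GROUP BY id ORDER BY id").fetchall()

print(rows)
print(per_student)
```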
This problem was just too irresistible, so I took a guess at the data structures we are dealing with. The technology wasn't specified in the question. This is in Transact-SQL.
create table student
(
id int not null primary key identity,
name nvarchar(100) not null default '',
graduation_date date not null default getdate(),
)
go
create table school
(
id int not null primary key identity,
name nvarchar(100) not null default ''
)
go
create table student_school_asc
(
student_id int not null foreign key references student (id),
school_id int not null foreign key references school (id),
primary key (student_id, school_id)
)
go
insert into student (name, graduation_date) values ('john', '2001-11-11')
insert into student (name, graduation_date) values ('paul', '2011-11-11')
insert into school (name) values ('nyu')
insert into school (name) values ('ucla')
insert into school (name) values ('uft')
insert into school (name) values ('mit')
insert into student_school_asc (student_id, school_id) values (1,1)
insert into student_school_asc (student_id, school_id) values (1,2)
insert into student_school_asc (student_id, school_id) values (2,3)
insert into student_school_asc (student_id, school_id) values (2,4)
select
s.id,
s.name,
s.graduation_date as [date],
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s1 where s1.rank_num = 1) as school,
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s2 where s2.rank_num = 2) as school1
from
student s
Result:
id name date school school1
--- ----- ---------- ------- --------
1 john 2001-11-11 nyu ucla
2 paul 2011-11-11 mit uft