Quite simply, I need to write a query that prints out all the years (in increasing order) in which only 1 exam took place. I am thinking something along the lines of querying for every distinct year and then using the count function to test whether there was only one exam in that year, but I can't quite seem to write it out. If it is of importance, the program is being written in Java, so I can manipulate the output.
The form of the EXAM table is:
CREATE TABLE EXAM
(Student_id char(5) NOT NULL,
Module_code varchar(6) NOT NULL,
Exam_year smallint NOT NULL,
Score smallint NOT NULL,
PRIMARY KEY (Student_id, Module_code), -- Creates a unique tuple
FOREIGN KEY (Student_id) REFERENCES STUDENT(Student_id), -- Enforces data integrity
FOREIGN KEY (Module_code) REFERENCES MODULE(Module_code) -- Enforces data integrity
);
You need to use HAVING, and count DISTINCT Module_code just in case several students sat the same exam in a year.
SELECT Exam_year
FROM EXAM
GROUP BY Exam_year
HAVING COUNT(distinct Module_code) = 1
ORDER BY Exam_year
select Exam_year
from EXAM
group by Exam_year
having count(module_code) = 1
order by Exam_year asc
One query should get the job done. Try this:
select Exam_year
from EXAM
group by Exam_year
having count(*) = 1
order by Exam_year
If you need more than just Exam_year in the output, this can be modified to return additional columns, but this is the simplest way to do what your requirement states.
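For example, since each qualifying year has exactly one exam, you could also pull back that exam's module code (just a sketch of one possible extension):
select Exam_year, min(Module_code) as only_module
from EXAM
group by Exam_year
having count(*) = 1
order by Exam_year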
I have two entities: Course and Lesson. A Course has many Lessons, and the lesson list needs to be ordered. I'm storing the Course and Lesson entities in a relational database.
I want to have a column to order the lessons; later I'll SELECT Lessons and ORDER BY this column. Is there a naming convention for what I should call this column?
As @Magnus suggested, adding an order column for the lessons would be appropriate. Create a lesson table like so:
CREATE TABLE lesson (
lesson_id int,
course_id int,
lesson_title varchar2(50),
lesson_descr varchar2(255),
lesson_order int
);
Then you can use the following query to pull the lessons of a particular course in order:
SELECT lesson_title FROM lesson
WHERE course_id = 12345
ORDER BY lesson_order ASC;
I'm trying to build a school management system and I'm having trouble designing an optimal database structure. I have Students, Staff and Users tables for login. The Users table will have login information only (userNumber, password), and Students and Staff will contain personal information. I separated Students and Staff because they contain different personal data, but they both have a userNumber.
users(
id,
userNumber,
password
)
students(
id,
studentNumber,
name,
age
)
staff(
id,
staffNumber,
name,
age,
salary,
dateOfHiring,
staffType
)
Let's say I'm logging in with userNumber 98242; how can I let the system know where it should look, in the Students table or the Staff table?
I would like some recommendations on database structures.
Just add a userType column to the users table.
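A minimal sketch of that idea (the column name and values here are only illustrative):
ALTER TABLE users ADD userType varchar(10); -- e.g. 'student' or 'staff'
SELECT userType FROM users WHERE userNumber = 98242; -- tells you which table to query next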
You could do a few things. You could create a type column in the users table and look that up. You could also join both tables and then, on receiving a record, check whether the student id or the staff id has been returned.
Then your query could be something like
SELECT users.id AS user_id, students.id AS student_id, staff.id AS staff_id FROM users
LEFT JOIN students ON users.id = students.id
LEFT JOIN staff ON users.id = staff.id
WHERE users.userNumber = 98242
Inheritance:
create table persons (
id int,
name text,
age int
);
create table users (
number int,
password text
) inherits (persons);
create table students (
) inherits (users);
create table staff (
salary numeric,
dateOfHiring date,
staffType text
) inherits (users);
Schematically, something like this.
Using the tableoid system column you can determine which table a particular row came from:
select
*,
tableoid::regclass -- Prints the origin table name (users, students, staff, ...)
from users
where number = 98242;
While there are only a few separate columns for students and staff members, I would keep it simple:
CREATE TABLE person (
person_id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, name text
, birthday date -- never age! bitrots in no time
, student_number int
, staff_number int
, salary numeric
, hired_at date
, staff_type text
, CONSTRAINT one_role_max CHECK (student_number IS NULL
OR (staff_number, salary, hired_at, staff_type) IS NULL)
, CONSTRAINT one_role_min CHECK (student_number IS NOT NULL
OR (staff_number, salary, hired_at, staff_type) IS NOT NULL)
);
CREATE TABLE users (
user_number int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
, person_id int NOT NULL REFERENCES person
, password text -- encrypted !
);
This way, one person can have 0-n user accounts - which is the typical reality. You can restrict to a single account per person by adding UNIQUE (person_id) to table users.
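A minimal sketch of that restriction, assuming the tables above:
ALTER TABLE users ADD CONSTRAINT users_person_id_uni UNIQUE (person_id);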
The CHECK constraint one_role_max enforces that either student columns or staff columns must stay NULL.
The CHECK constraint one_role_min enforces that at least one of both must have any values.
Adapt what must/can be filled in to your needs. The expressions work excellently for the current design. See:
NOT NULL constraint over a set of columns
While it's strictly "either/or" and the only student column is student_number, this query answers your question:
SELECT CASE WHEN student_number IS NULL THEN 'staff' ELSE 'student' END AS user_role
FROM person
WHERE person_id = (SELECT person_id FROM users WHERE user_number = 98242);
Or remove one or both CHECK constraints to allow the same person to be student and staff, or neither. Adapt above query accordingly.
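For instance, with the CHECK constraints removed, the query could be adapted along these lines (just a sketch):
SELECT CASE WHEN student_number IS NOT NULL AND staff_number IS NOT NULL THEN 'both'
            WHEN student_number IS NOT NULL THEN 'student'
            WHEN staff_number IS NOT NULL THEN 'staff'
            ELSE 'none' END AS user_role
FROM person
WHERE person_id = (SELECT person_id FROM users WHERE user_number = 98242);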
You could use inheritance for this (like Abelisto demonstrates), but I'd rather stay away from it. There once was the idea of an object-relational DBMS. But the community has largely moved on. It works, but with caveats. Partitioning used to be a major use case. But declarative partitioning in Postgres 10 mostly superseded the inheritance-based implementation. There is not too much interest in it any more.
What about all those empty columns? Am I wasting a lot of space there? The opposite is the case. The disk footprint won't get much smaller than this. NULL storage is very cheap. See:
Does not using NULL in PostgreSQL still use a NULL bitmap in the header?
I am using PostgreSQL and am trying to restrict the number of concurrent loans that a student can have. To do this, I have created a CTE that selects all unreturned loans grouped by StudentID, and counts the number of unreturned loans for each StudentID. Then, I am attempting to create a check constraint that uses that CTE to restrict the number of concurrent loans that a student can have to 7 at most.
The below code does not work because it is syntactically invalid, but hopefully it can communicate what I am trying to achieve. Does anyone know how I could implement my desired restriction on loans?
CREATE TABLE loan (
id SERIAL PRIMARY KEY,
copy_id INTEGER REFERENCES media_copies (copy_id),
account_id INT REFERENCES account (id),
loan_date DATE NOT NULL,
expiry_date DATE NOT NULL,
return_date DATE,
WITH currentStudentLoans (student_id, current_loans) AS
(
SELECT account_id, COUNT(*)
FROM loan
WHERE account_id IN (SELECT id FROM student)
AND return_date IS NULL
GROUP BY account_id
)
CONSTRAINT max_student_concurrent_loans CHECK(
(SELECT current_loans FROM currentStudentLoans) BETWEEN 0 AND 7
)
);
For additional (and optional) context, I include an ER diagram of my database schema.
You cannot do this using an in-line CTE like this. You have several choices.
The first is a UDF and check constraint. Essentially, the logic in the CTE is put in a UDF and then a check constraint validates the data.
The second is a trigger to do the check on this table. However, that is tricky because the counts are on the same table.
The third is storing the total number in another table -- probably accounts -- and keeping it up-to-date for inserts, updates, and deletes on this table. Keeping that value up-to-date requires triggers on loans. You can then put the check constraint on accounts.
I'm not sure which solution fits best in your overall schema. The first is closest to what you are doing now. The third "publishes" the count, so it is a bit clearer what is going on.
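As a rough sketch of the first option in PostgreSQL (the function name is made up here, and the usual caveat applies: concurrent transactions won't see each other's uncommitted loans, so this is not bulletproof):
CREATE FUNCTION unreturned_student_loans(p_account_id int)
RETURNS bigint AS $$
    SELECT count(*)
    FROM loan
    WHERE account_id = p_account_id
      AND account_id IN (SELECT id FROM student)
      AND return_date IS NULL;
$$ LANGUAGE sql STABLE;

ALTER TABLE loan ADD CONSTRAINT max_student_concurrent_loans
    -- allow a new loan only while fewer than 7 unreturned loans already exist
    CHECK (unreturned_student_loans(account_id) < 7);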
I have a "raw" table that looks like this (among other many fields):
team_id | team_name
---------+-------------------------
1 | Team1
1 | Team1
2 | Team2
2 | Team2
I want to extract the team names and their id codes and create another table for them, so I created:
CREATE TABLE teams (
team_id integer NOT NULL,
team_name varchar(50) NOT NULL,
CONSTRAINT team_pkey PRIMARY KEY (team_id)
);
And I am planning to copy the data from the old table to the recently created one like this:
INSERT INTO teams(team_id,team_name)
SELECT team_id,team_name FROM rawtable
GROUP BY team_id, team_name;
At first I wasn't adding the GROUP BY part, and I was getting a message:
ERROR: duplicate key value violates unique constraint "team_pkey"
I added the GROUP BY so it doesn't try to insert more than one row for the same team, but the problem still persists and I keep getting the same message.
I don't understand what is causing it. It looks like I am inserting single, non-duplicate rows into the table. What's the best way to fix this?
If two different team names with the same id are in rawtable, e.g. (1, 'foo') and (1, 'bar'), the GROUP BY will still return both rows, because those two combinations are different.
If you just want to pick one of the rows for duplicate values of team_id then you should use something like this:
insert into teams (team_id,team_name)
select distinct on (team_id) team_id, team_name
from rawtable
order by team_id;
The Postgres-specific DISTINCT ON clause will make sure that only one row per team_id is returned.
My best guess is that you have the same team_id for more than one team_name somewhere in your table. Try adding HAVING count(*) = 1 to your select statement.
Since the team_id is unique in the destination table, two separate team names with the same id will create duplicates, one row for each name.
A simple fix is to group by team_id so that you only get a single row per id, and pick one of the names the team has (here we somewhat arbitrarily use MAX to get the last in alphabetical order)
INSERT INTO teams(team_id,team_name)
SELECT team_id, MAX(team_name) FROM rawtable
GROUP BY team_id
One of your Team1 or Team2 probably has some extra spaces or nonprintable characters. This would cause your GROUP BY to return multiple rows for team_id 1 or 2, causing the problem.
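A quick way to check for that (a sketch, using the table name from the question):
SELECT team_id, COUNT(DISTINCT team_name) AS name_variants
FROM rawtable
GROUP BY team_id
HAVING COUNT(DISTINCT team_name) > 1;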
Try using distinct on in your query:
insert into teams (team_id,team_name) select distinct on (team_id)
team_id, team_name from rawtable order by team_id;
I have created some MSSQL queries. All of them work well, but I think they could be done in a faster way. Can you help me optimize them?
This is the database:
Create table Teachers
(TNO char(3) Primary key,
TNAME char(20),
TITLE char(6) check (TITLE in('Prof','PhD','MSc')),
CITY char(12),
SUPNO char(3) REFERENCES Teachers);
Create table Students
(SNO char(3) Primary key,
SNAME char(20),
SYEAR int,
CITY char(20));
Create table Courses
(CNO char(3) Primary key,
CNAME char(20),
STUDYEAR int);
Create table TSC
(TNO char(3) REFERENCES Teachers,
SNO char(3) REFERENCES Students,
CNO char(3) REFERENCES Courses,
HOURS int,
GRADE float,
PRIMARY KEY(TNO,SNO,CNO));
1:
In which study year are there the most courses?
Problem: it looks like the result is being sorted while I only need the max element.
select
top 1 STUDYEAR
from
Courses
group by
STUDYEAR
order by COUNT(*) DESC
2:
Show the TNOs of those teachers who do NOT have courses in the 1st study year.
Problem: I'm using a subquery only to negate a select query
select
TNO
from
Teachers
where
TNO not in (
select distinct
tno
from
Courses, TSC
where tsc.CNO=Courses.CNO and STUDYEAR = 1)
Some ordering needs to be done to find the max or min value; maybe using ranking functions instead of a GROUP BY would be better, but I frankly expect the query optimizer to be smart enough to find a good query plan for this specific query.
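For comparison, a T-SQL variant that also returns ties for the top study year (it still groups and sorts underneath; just a sketch):
select top 1 with ties STUDYEAR
from Courses
group by STUDYEAR
order by COUNT(*) desc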
The subquery performs well as long as it isn't using columns from the outer query (which may cause it to be executed for every row in many cases). However, I'd leave out the DISTINCT, as it has no benefit. Also, I'd always use the explicit join syntax, but that's mostly a matter of personal preference (for inner joins; outer joins should always be done with the explicit syntax).
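For example, the second query with explicit join syntax and without the DISTINCT might look like this (a sketch of the suggestion):
select TNO
from Teachers
where TNO not in (
    select TSC.TNO
    from TSC
    join Courses on Courses.CNO = TSC.CNO
    where Courses.STUDYEAR = 1)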
So all in all, I think these queries are simple and clear enough to be handled well by the query optimizer, yielding good performance. Do you have a specific performance issue that prompted this question? If yes, give us more info (query plan etc.); if no, just leave them alone - don't do premature optimization.