SQL gym with less workers per district - sql

I'm struggling to find how to make a specific query to a set of tables that I have on my local database.
CREATE TABLE Gym (
eid INT PRIMARY KEY,
name VARCHAR(127) UNIQUE,
district VARCHAR(127),
area INT);
CREATE TABLE Trainer (
id INT PRIMARY KEY,
name VARCHAR(127),
birth_year INT,
year_credentials_expiry INT
);
CREATE TABLE Works (
eid INT,
id INT,
since INT,
FOREIGN KEY (eid) REFERENCES Gym (eid),
FOREIGN KEY (id) REFERENCES Trainer (id),
PRIMARY KEY (eid,id));
The question is the following: I have several gyms, and in some cases two or more gyms are located in the same district. How can I find, for each district, the gym with the fewest trainers?
The only thing I have managed to get is the number of trainers per gym. With that, I can only get the gym with the fewest trainers across all districts...
NOTE: I am NOT allowed to use subqueries (SELECT inside SELECT's; SELECT inside FROM's)
Thank you in advance.

If you can have subqueries in the WHERE or HAVING clause, then you can approach it this way.
First look at your query that counts the number of trainers per gym (keeping each gym's district).
Now, write a query (using that as a subquery) that calculates the minimum number of trainers in a gym in a district.
Next, take your first query, add a having clause and correlate it to the second by district and number of trainers.
This does use a subquery in a subquery, but it is in the having clause. I'm not writing the query for you, since you need to learn how to do that yourself.
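For readers who want something to check their own attempt against, here is a minimal sketch of the HAVING approach, written against the Gym and Works tables from the question. It is a variation rather than a literal transcription of the steps above: it uses <= ALL instead of an explicit MIN, which is an assumption of this sketch that keeps the subquery out of the FROM clause. The LEFT JOIN is also an assumption, so that gyms with no trainers count as zero.

```sql
-- One row per gym; keep a gym only if no gym in the same district
-- has fewer trainers. Ties are all returned.
SELECT g.district, g.name, COUNT(w.id) AS trainers
FROM Gym g
LEFT JOIN Works w ON w.eid = g.eid          -- gyms without trainers count as 0
GROUP BY g.eid, g.name, g.district
HAVING COUNT(w.id) <= ALL (
    SELECT COUNT(w2.id)                     -- trainers per gym ...
    FROM Gym g2
    LEFT JOIN Works w2 ON w2.eid = g2.eid
    WHERE g2.district = g.district          -- ... within the same district
    GROUP BY g2.eid
);
```

The only subquery sits in the HAVING clause, so this stays within the "no SELECT inside SELECT or FROM" restriction. Not every DBMS supports the ALL quantifier (SQLite does not, for example).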
By the way, if you have window/analytic functions there may be other solutions.
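As a sketch of the window-function route, assuming a DBMS that supports common table expressions and RANK() (for example PostgreSQL, SQL Server or MySQL 8), the per-gym counts can be ranked within each district. The names ranked and rnk are made up for this example, and the CTE is effectively a subquery, so this may not satisfy the restriction in the question.

```sql
WITH ranked AS (
    SELECT g.district,
           g.name,
           COUNT(w.id) AS trainers,
           -- rank gyms inside each district by their trainer count, smallest first
           RANK() OVER (PARTITION BY g.district ORDER BY COUNT(w.id)) AS rnk
    FROM Gym g
    LEFT JOIN Works w ON w.eid = g.eid
    GROUP BY g.eid, g.name, g.district
)
SELECT district, name, trainers
FROM ranked
WHERE rnk = 1;   -- ties within a district are all returned
```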

Related

Constraint is not enough?

I have a very simple database consisting of Person_id, City and is_manager columns. is_manager is a bit column (0 or 1) representing FALSE or TRUE.
There are some restrictions, some of which I have managed to implement:
each city can have one or more persons (unique person_id + City)
each person is either a manager or not in a given city (unique person + city + is_manager)
each city can have only one manager, but more than one non-manager
not every person has to have a city; some might not be assigned to a city at all
I managed to implement the first two constraints easily, but not the third, because each city can have only one manager but more than one non-manager.
I have another way to solve it, but it would not be a 100% server-side solution. I want a 100% SQL Server-side solution.
I think it can be done with a trigger.
If so, how? (When a row with is_manager = TRUE arrives, check whether that city already has a manager; if not, accept the row, otherwise reject it.)
If a trigger is not necessary, how can it be done with constraints?
A sample DB
create table Person(
id int primary key,
cityId int,
constraint UK1 unique(cityId, id)
);
create table City (
id int primary key,
managerId int,
constraint FK1 foreign key(id, managerId) references Person(cityId, id)
);
The business process is to first assign a person to a city and then make him/her the manager of that city.
db<>fiddle including test data
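Besides the two-table design above, the "only one manager per city" rule can probably also be enforced on the single table described in the question with a filtered unique index, which SQL Server supports. This is a minimal sketch; the table name PersonCity, the index and constraint names, and the sample values are assumptions made for illustration.

```sql
-- Hypothetical single-table layout following the question's columns.
-- A person without a city simply has no row here.
CREATE TABLE PersonCity (
    person_id  INT          NOT NULL,
    city       VARCHAR(100) NOT NULL,
    is_manager BIT          NOT NULL DEFAULT 0,
    CONSTRAINT UQ_PersonCity UNIQUE (person_id, city)
);

-- Uniqueness is checked only for rows where is_manager = 1,
-- so each city can have at most one manager but any number of non-managers.
CREATE UNIQUE INDEX UX_PersonCity_OneManager
    ON PersonCity (city)
    WHERE is_manager = 1;

INSERT INTO PersonCity (person_id, city, is_manager) VALUES (1, 'Oslo', 1); -- first manager of the city
INSERT INTO PersonCity (person_id, city, is_manager) VALUES (2, 'Oslo', 0); -- non-manager, not covered by the index
INSERT INTO PersonCity (person_id, city, is_manager) VALUES (3, 'Oslo', 1); -- violates the unique index: second manager
```

With this approach no trigger is needed; the third insert fails with a duplicate-key error.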

Benefits of using an autogenerated primary key instead of a constant unique name

I've heard that having an autogenerated primary key is a convention. However, I'm trying to understand its benefits in the following scenario:
CREATE TABLE countries
(
countryID int(11) PRIMARY KEY AUTO_INCREMENT,
countryName varchar(128) NOT NULL
);
CREATE TABLE students
(
studentID int(11) PRIMARY KEY AUTO_INCREMENT,
studentName varchar(128) NOT NULL,
countryOfOrigin int(11) NOT NULL,
FOREIGN KEY (countryOfOrigin) REFERENCES countries (countryID)
);
INSERT INTO countries (countryName)
VALUES ('Germany'), ('Sweden'), ('Italy'), ('China');
If I want to insert something into the students table, I need to look up the countryIDs in the countries table:
INSERT INTO students (studentName, countryOfOrigin)
VALUES ('Benjamin Schmidt', (SELECT countryID FROM countries WHERE countryName = 'Germany')),
('Erik Jakobsson', (SELECT countryID FROM countries WHERE countryName = 'Sweden')),
('Manuel Verdi', (SELECT countryID FROM countries WHERE countryName = 'Italy')),
('Min Lin', (SELECT countryID FROM countries WHERE countryName = 'China'));
In a different scenario, as I know that the countryNames in the countries table are unique and not null, I could do the following:
CREATE TABLE countries2
(
countryName varchar(128) PRIMARY KEY
);
CREATE TABLE students2
(
studentID int(11) PRIMARY KEY AUTO_INCREMENT,
studentName varchar(128) NOT NULL,
countryOfOrigin varchar(128) NOT NULL,
FOREIGN KEY (countryOfOrigin) REFERENCES countries2 (countryName)
);
INSERT INTO countries2 (countryName)
VALUES ('Germany'), ('Sweden'), ('Italy'), ('China');
Now, inserting data into the students2 table is simpler:
INSERT INTO students2 (studentName, countryOfOrigin)
VALUES ('Benjamin Schmidt', 'Germany'),
('Erik Jakobsson', 'Sweden'),
('Manuel Verdi', 'Italy'),
('Min Lin', 'China');
So why should the first option be the better one, given that countryNames are unique and are never going to change?
There are two aspects involved here:
natural keys vs. surrogate keys
autoincremented values
You are wondering why you should deal with some arbitrary number for a country when a country can be uniquely identified by its name. Well, imagine you use the country names in several tables to relate rows to each other. Then at some point you are told that you misspelled a country. You want to correct this, but have to do it in every table the country occurs in. In big databases you usually don't have cascading updates, in order to avoid updates that unexpectedly take hours instead of mere minutes or seconds. So you must do this manually, but the foreign key constraints get in your way. You cannot change the parent table's key, because there are child tables using it, and you cannot change the key in the child tables first, because that key has to exist in the parent table. You'll have to work with a new row in the parent table and start from there. Quite some task.
And even if you have no spelling issue, at some point someone might say "we need the official country names; you have China, but it must be the People's Republic of China instead", and again you must look up and change that country in all those tables.
And what about partial backups? A table gets totally messed up due to some programming error and must be replaced by last week's backup, because this is the best you have. But suddenly some keys don't match any more. You never want a table's key to change.
You say "country names are unique and are never going to change". Think again :-)
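To make the renaming problem concrete, here is a rough sketch using the two schemas from the question. It assumes MySQL/MariaDB with InnoDB and the foreign keys exactly as declared there (no ON UPDATE CASCADE); the new country name is the one from the example above.

```sql
-- Surrogate-key schema (countries / students):
-- the name lives in one row only, so one UPDATE is enough.
UPDATE countries
SET countryName = 'People''s Republic of China'
WHERE countryName = 'China';

-- Natural-key schema (countries2 / students2):
-- updating the parent key directly is rejected by the foreign key
-- as long as students2 rows still reference 'China':
--   UPDATE countries2 SET countryName = 'People''s Republic of China'
--   WHERE countryName = 'China';

-- So a new parent row is needed first, then every child table is re-pointed.
INSERT INTO countries2 (countryName) VALUES ('People''s Republic of China');

UPDATE students2
SET countryOfOrigin = 'People''s Republic of China'
WHERE countryOfOrigin = 'China';          -- repeat for every other referencing table

DELETE FROM countries2 WHERE countryName = 'China';
```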
It is easier to have your database use a technical arbitrary ID instead. Then the country name only exists in the country table. And if that name must get changed, you change it just in that one place, and all relations stay intact. This, however, doesn't mean that natural keys are worse than technical IDs. They are not. But it's more difficult with them to set up a database correctly. In case of countries a good natural key would be a country ISO code, because this uniquely identifies a country and doesn't change. This would be my choice here.
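A possible shape for the ISO-code variant, as a sketch only: the table names countries3 and students3 are made up here to sit next to the question's tables, and the two-letter ISO 3166-1 alpha-2 code is one choice among the ISO country codes.

```sql
CREATE TABLE countries3
(
    countryCode char(2) PRIMARY KEY,          -- ISO 3166-1 alpha-2 code, e.g. 'DE'
    countryName varchar(128) NOT NULL UNIQUE  -- can be corrected without touching child tables
);

CREATE TABLE students3
(
    studentID int PRIMARY KEY AUTO_INCREMENT,
    studentName varchar(128) NOT NULL,
    countryOfOrigin char(2) NOT NULL,
    FOREIGN KEY (countryOfOrigin) REFERENCES countries3 (countryCode)
);

INSERT INTO countries3 (countryCode, countryName)
VALUES ('DE', 'Germany'), ('SE', 'Sweden'), ('IT', 'Italy'), ('CN', 'China');

-- Inserts stay as simple as in the countries2 variant: no lookup subquery needed.
INSERT INTO students3 (studentName, countryOfOrigin)
VALUES ('Benjamin Schmidt', 'DE'),
       ('Erik Jakobsson', 'SE'),
       ('Manuel Verdi', 'IT'),
       ('Min Lin', 'CN');
```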
With students it's the same. Students usually have a student number or student ID in the real world, so you can simply use this number to uniquely identify a student in the database. But then, how do we get these unique student IDs? At a large university, two secretaries may want to enter new students at the same time. They ask the system what the last student's ID was. It was #11223, so they both want to issue #11224, which of course causes a conflict, because only one student can be given that number. To avoid this, DBMSs offer sequences from which numbers are drawn. Thus one of the secretaries pulls #11224 and the other #11225. Auto-incremented IDs work this way. Both secretaries enter their new student, the rows get inserted into the student table, and the two different resulting IDs are reported back to the secretaries. This makes sequences and auto-incrementing IDs a great and safe tool to work with.
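In MySQL/MariaDB terms, the "reported back" step could look like the sketch below, using the students table from the question. The student name is invented for the example. LAST_INSERT_ID() returns the auto-increment value generated by the current connection, so two sessions inserting at the same time each see their own ID.

```sql
-- Each session inserts without choosing an ID itself ...
INSERT INTO students (studentName, countryOfOrigin)
SELECT 'Anna Berg', countryID               -- hypothetical new student
FROM countries
WHERE countryName = 'Sweden';

-- ... and then asks which ID the DBMS assigned to its own insert.
SELECT LAST_INSERT_ID() AS newStudentID;
```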
Convention can be a useful guide. It isn't necessarily the best option in all situations.
There are usually tradeoffs involved, like space, convenience, etc.
While you showed one method of resolving / inserting the proper country key value, there's a slightly less wordy option supported by standard SQL (and many databases).
INSERT INTO students (studentName, countryOfOrigin)
WITH list (name, country) AS (
SELECT *
FROM (
VALUES ('Benjamin Schmidt', 'Germany')
, ('Erik Jakobsson', 'Sweden')
, ('Manuel Verdi', 'Italy')
, ('Min Lin', 'China')
) AS x
)
SELECT name, countryID
FROM list AS l
JOIN countries AS c
ON c.countryName = l.country
;
and a little less wordy again:
INSERT INTO students (studentName, countryOfOrigin)
WITH list (name, country) AS (
VALUES ('Benjamin Schmidt', 'Germany')
, ('Erik Jakobsson', 'Sweden')
, ('Manuel Verdi', 'Italy')
, ('Min Lin', 'China')
)
SELECT name, countryID
FROM list AS l
JOIN countries AS c
ON c.countryName = l.country
;
Here's a test case with MariaDB 10.5:
Working test case (updated)

Indicate that a table's attribute is derived in SQL

Is there a way in SQL to specify that an attribute is derived? Currently, I'm creating a table Employee, which has a derived attribute age, but I've no idea how to indicate it (and I'm afraid there's no way to do it):
create table Employee(
Id int NOT NULL UNIQUE PRIMARY KEY AUTO_INCREMENT,
Age int, # How to indicate this is a derived attribute?
Country varchar(255),
City varchar(255),
Birthrate double
);
If what you mean is that age is derived from Birthrate, I think you should consider whether you really need an age column in your table at all. Unless you are building a data warehouse, calculating the age each time is much saner than maintaining this column in production. Think of it: this column would have to be recalculated every time someone has a birthday!
For convenience, you can always create a view for it, like this:
CREATE VIEW Employee_view AS
SELECT
e.Id,
<database-specific expression that calculates the age> AS Age,
e.Country,
e.City,
e.Birthrate
FROM Employee e;
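As one possible concrete version for MySQL/MariaDB, assuming the column is really meant to be a birth date: the table Employee2, the view Employee2_view and the Birthdate column are assumptions for this sketch, not the asker's actual schema. TIMESTAMPDIFF(YEAR, ...) counts completed years, which matches the usual meaning of age.

```sql
-- Hypothetical variant of the table with a DATE column instead of Age.
CREATE TABLE Employee2 (
    Id int PRIMARY KEY AUTO_INCREMENT,
    Country varchar(255),
    City varchar(255),
    Birthdate date
);

-- Age is computed at query time, so it never goes stale.
CREATE VIEW Employee2_view AS
SELECT
    e.Id,
    TIMESTAMPDIFF(YEAR, e.Birthdate, CURDATE()) AS Age,
    e.Country,
    e.City,
    e.Birthdate
FROM Employee2 e;
```

A stored generated column would not work for this, because its expression may not depend on the current date; a view (or computing the age in the query) avoids that limitation.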

Optimizing SQL queries

I have created some MSSQL queries, all of them work well, but I think it could be done in a faster way. Can you help me to optimize them?
That's the database:
Create table Teachers
(TNO char(3) Primary key,
TNAME char(20),
TITLE char(6) check (TITLE in('Prof','PhD','MSc')),
CITY char(12),
SUPNO char(3) REFERENCES Teachers);
Create table Students
(SNO char(3) Primary key,
SNAME char(20),
SYEAR int,
CITY char(20));
Create table Courses
(CNO char(3) Primary key,
CNAME char(20),
STUDYEAR int);
Create table TSC
(TNO char(3) REFERENCES Teachers,
SNO char(3) REFERENCES Students,
CNO char(3) REFERENCES Courses,
HOURS int,
GRADE float,
PRIMARY KEY(TNO,SNO,CNO));
1:
In which study year are there the most courses?
Problem: it looks like the result is being sorted while I only need the max element.
select
top 1 STUDYEAR
from
Courses
group by
STUDYEAR
order by COUNT(*) DESC
2:
Show the TNOs of those teachers who do NOT have courses with the 1st studyear
Problem: I'm using a subquery only to negate a select query
select
TNO
from
Teachers
where
TNO not in (
select distinct
tno
from
Courses, TSC
where tsc.CNO=Courses.CNO and STUDYEAR = 1)
Some ordering needs to be done to find the max or min value; maybe using ranking functions instead of a group by would be better but I frankly expect the query analyzer to be smart enough to find a good query plan for this specific query.
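For comparison, a ranking-function variant of the first query might look like the sketch below (the names ranked and rnk are made up). It still needs the GROUP BY to count courses per year, and unlike TOP 1 it returns every study year that ties for the highest count. Whether it is any faster would have to be checked against the actual query plans.

```sql
SELECT ranked.STUDYEAR
FROM (
    SELECT STUDYEAR,
           RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk  -- 1 = most courses
    FROM Courses
    GROUP BY STUDYEAR
) AS ranked
WHERE ranked.rnk = 1;
```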
The subquery is performing well as long as it isn't using columns from the outer query (which may cause it to be performed for every row in many cases). However, I'd leave away the distinct, as it has no benefit. Also, I'd always use the explicit join syntax, but that's mostly a matter of personal preference (for inner joins - outer joins should always be done with the explicit syntax).
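Applied to the second query, those two suggestions (dropping DISTINCT and using explicit join syntax) would give something like this; the table aliases are added only for readability:

```sql
SELECT t.TNO
FROM Teachers t
WHERE t.TNO NOT IN (
    SELECT tsc.TNO                            -- no DISTINCT needed inside NOT IN
    FROM TSC tsc
    INNER JOIN Courses c ON c.CNO = tsc.CNO   -- explicit join instead of a comma join
    WHERE c.STUDYEAR = 1
);
```

NOT IN is safe here because TSC.TNO is part of the primary key and therefore cannot be NULL.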
So all in all I think that these queries are simple and clear enough to be handled well by the query analyzer, thereby yielding good performance. Is there a specific performance issue behind this question? If yes, give us more info (query plan etc.); if not, just leave them as they are and don't do premature optimization.

How to design relation between tables employee,client and phone Number?

I have a relational database with a Client table, containing id, name, and address, with many phone numbers
and I have an Employee table, also containing id, name, address, etc., and also with many phone numbers.
Is it more logical to create one "Phone Number" table and link the Clients and Employees, or to create two separate "Phone Number" tables, one for Clients and one for Employees?
If I choose to create one table, can I use one foreign key for both the Client and Employee or do I have to make two foreign keys?
If I choose to make one foreign key, will I have to make the Client ids start at 1 and increment by 5, and Employee ids start at 2 and increment by 5 so the two ids will not be the same?
If I create two foreign keys will one have a value and the other allow nulls?
The solution which I would go with would be:
CREATE TABLE Employees (
employee_id INT NOT NULL,
first_name VARCHAR(30) NOT NULL,
...
CONSTRAINT PK_Employees PRIMARY KEY (employee_id)
)
CREATE TABLE Customers (
customer_id INT NOT NULL,
customer_name VARCHAR(50) NOT NULL,
...
CONSTRAINT PK_Customers PRIMARY KEY (customer_id)
)
-- This is basic, only supports U.S. numbers, and would need to be changed to
-- support international phone numbers
CREATE TABLE Phone_Numbers (
phone_number_id INT NOT NULL,
area_code CHAR(3) NOT NULL,
prefix CHAR(3) NOT NULL,
line_number CHAR(4) NOT NULL,
extension VARCHAR(10) NULL,
CONSTRAINT PK_Phone_Numbers PRIMARY KEY (phone_number_id),
CONSTRAINT UI_Phone_Numbers UNIQUE (area_code, prefix, line_number, extension)
)
CREATE TABLE Employee_Phone_Numbers (
employee_id INT NOT NULL,
phone_number_id INT NOT NULL,
CONSTRAINT PK_Employee_Phone_Numbers PRIMARY KEY (employee_id, phone_number_id)
)
CREATE TABLE Customer_Phone_Numbers (
customer_id INT NOT NULL,
phone_number_id INT NOT NULL,
CONSTRAINT PK_Customer_Phone_Numbers PRIMARY KEY (customer_id, phone_number_id)
)
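The two linking tables above declare no foreign keys. If referential integrity is wanted, constraints along these lines could be added once the tables exist; the constraint names are made up for this sketch.

```sql
-- Each link row must point at an existing employee/customer and an existing phone number.
ALTER TABLE Employee_Phone_Numbers
    ADD CONSTRAINT FK_Employee_Phone_Numbers_Employees
    FOREIGN KEY (employee_id) REFERENCES Employees (employee_id);

ALTER TABLE Employee_Phone_Numbers
    ADD CONSTRAINT FK_Employee_Phone_Numbers_Phone_Numbers
    FOREIGN KEY (phone_number_id) REFERENCES Phone_Numbers (phone_number_id);

ALTER TABLE Customer_Phone_Numbers
    ADD CONSTRAINT FK_Customer_Phone_Numbers_Customers
    FOREIGN KEY (customer_id) REFERENCES Customers (customer_id);

ALTER TABLE Customer_Phone_Numbers
    ADD CONSTRAINT FK_Customer_Phone_Numbers_Phone_Numbers
    FOREIGN KEY (phone_number_id) REFERENCES Phone_Numbers (phone_number_id);
```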
Of course, the model might change based on a lot of different things. Can an employee also be a customer? If two employees share a phone number, how will you handle it on the front end when the phone number for one employee is changed? Will it change the number for the other employee as well? Warn the user and ask what they want to do?
Those last few questions don't necessarily affect how the data is ultimately modeled, but will certainly affect how the front-end is coded and what kind of stored procedures you might need to support it.
"The Right Way", allowing you to use foreign keys for everything, would be to have a fourth table phoneNumberOwner(id) and have fields client.phoneNumberOwnerId and employee.phoneNumberOwnerId; thus, each client and each employee has its own record in the phoneNumberOwner table. Then, your phoneNumbers table becomes (phoneNumberOwnerId, phoneNumber), allowing you to attach multiple phone numbers to each phoneNumberOwner record.
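A minimal sketch of that layout follows. The table and column names are the ones given in the paragraph above; the column types, the name columns and the UNIQUE constraints are assumptions added for illustration.

```sql
CREATE TABLE phoneNumberOwner (
    id INT PRIMARY KEY
);

CREATE TABLE client (
    id INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    phoneNumberOwnerId INT NOT NULL UNIQUE,   -- each client has its own owner record
    FOREIGN KEY (phoneNumberOwnerId) REFERENCES phoneNumberOwner (id)
);

CREATE TABLE employee (
    id INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    phoneNumberOwnerId INT NOT NULL UNIQUE,   -- each employee has its own owner record
    FOREIGN KEY (phoneNumberOwnerId) REFERENCES phoneNumberOwner (id)
);

-- Many phone numbers per owner, all covered by a single foreign key.
CREATE TABLE phoneNumbers (
    phoneNumberOwnerId INT NOT NULL,
    phoneNumber VARCHAR(20) NOT NULL,
    PRIMARY KEY (phoneNumberOwnerId, phoneNumber),
    FOREIGN KEY (phoneNumberOwnerId) REFERENCES phoneNumberOwner (id)
);
```

This avoids tricks like interleaving client and employee id ranges: the ids of clients and employees can overlap freely, because phone numbers refer to the owner record instead.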
Maybe you can somehow justify it, but to my way of thinking it is not logical to have employees and clients in the same table. It seems you want to do this only so that your foreign keys (in the telephone-number table) all point to the same table. This is not a good reason for combining employees and clients.
Use three tables: employees, clients, and telephone-number. In the telephone table, you can have a field that indicates employee or client. As an aside, I don't see why telephone number needs to be a foreign key: that only adds complexity with very little benefit, imo.
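The telephone table from this suggestion might be sketched as follows. The table name, column names and the 'E'/'C' codes are assumptions for illustration. Note that with this layout the DBMS cannot enforce a foreign key on owner_id, since it may refer to either table.

```sql
CREATE TABLE telephone_number (
    owner_type   CHAR(1)     NOT NULL
        CHECK (owner_type IN ('E', 'C')),    -- 'E' = employee, 'C' = client
    owner_id     INT         NOT NULL,       -- id in employees or clients, depending on owner_type
    phone_number VARCHAR(20) NOT NULL,
    PRIMARY KEY (owner_type, owner_id, phone_number)
);

-- All numbers of the employee with id 42 (example value):
SELECT phone_number
FROM telephone_number
WHERE owner_type = 'E' AND owner_id = 42;
```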
Unless there are special business requirements I would expect a telephone number to be an attribute of an employee or client entity and not an entity in its own right.
If it were considered an entity in its own right it would be 'all key' i.e. its identifier is the compound of its attributes and has no attributes other than its identifier. If the sub-attributes aren't stored apart then it only has one attribute i.e. the telephone number itself! Therefore, it isn't usually 'interesting' enough to be an entity in its own right and a telephone numbers table, whether superclass or subclass, is usually overkill (as I say, barring special business requirements).