Database Technology - postgresql - sql

I am quite new to postgresql. Could any expert help me solve this problem please.
Consider the following PostgreSQL tables created for a university system recording which students take which modules:
CREATE TABLE module (id bigserial, name text);
CREATE TABLE student (id bigserial, name text);
CREATE TABLE takes (student_id bigint, module bigint);
Rewrite the SQL to include sensible primary keys.
CREATE TABLE module
(
m_id bigserial,
name text,
CONSTRAINT m_key PRIMARY KEY (m_id)
);
CREATE TABLE student
(
s_id bigserial,
name text
CONSTRAINT s_key PRIMARY KEY (s_id)
);
CREATE TABLE takes
(
student_id bigint,
module bigint,
CONSTRAINT t_key PRIMARY KEY (student_id)
);
Given this schema I have the following questions:
Write an SQL query to count how many students are taking DATABASE.
SELECT COUNT(name)
FROM student
WHERE module = 'DATABASE' AND student_id=s_id
Write an SQL query to show the student IDs and names (but nothing else) of all student taking DATABASE
SELECT s_id, name
FROM Student, take
WHERE module = 'DATABASE' AND student_id = s_id
Write an SQL query to show the student IDs and names (but nothing else) of all students not taking DATABASE.
SELECT s_id, name
FROM Student, take
WHERE student_id = s_id AND module != 'DATABASE'
Above are my answers. Please correct me if I am wrong and please comment the reason. Thank you for your expertise.

This looks like homework so I'm not going to give a detailed answer. A few hints:
I found one case where you used ยด quotes instead of ' apostrophes. This suggests you're writing SQL in something like Microsoft Word, which does so-called "smart quotes. Don't do that. Use a sensible text editor. If you're on Windows, Notepad++ is a popular choice. (Fixed it when reformatting the question, but wanted to mention it.)
Don't use the legacy non-ANSI join syntax JOIN table1, table2, table3 WHERE .... It's horrible to read and it's much easier to make mistakes with. You should never have been taught it in the first place. Also, qualify your columns - take.module not just module. Always write ANSI joins, e.g. in your example above:
FROM Student, take
WHERE module = 'DATABASE' AND student_id = s_id
becomes
FROM student
INNER JOIN take
ON take.module = 'DATABASE'
AND take.student_id = student.s_id;
(if the table names are long you can use aliases like FROM student s then s.s_id)
Query 3 is totally wrong. Imagine if take has two rows for a student, one where the student is taking database and one where they're taking cooking. Your query will still return a result for them, even though they're taking database. (It'd also return the same student ID multiple times, which you don't want). Think about subqueries. You will need to query the student table, using a NOT EXISTS (SELECT .... FROM take ...) to filter out students who are not taking database. The rest you get to figure out on your own.
Also, your schemas don't actually enforce the constraint that a student may only take DATABASE once at a time. Either add that, or consider in your queries the possibility that a student might be registered for DATABASE twice.

Related

Add logical constraint while designing tables

I have a table like below:
Student:
StudentId FirstName LastName Grade
Course:
CourseId Name Desc
Offering
OfferNum CourseId ProfessorId
Student_Course_Mapping
OfferNum StudentId
I have a constraint which says that student can enrol only in 2
offering for 1 term and year.
For eg: Lets say student john has enrol in Java and Python for Winter
term(January - April) 2020 than he cannot enrol in other courses.
I don't have any front end or anything else to restrict this. I only have been told to work with the database and apply this constraint.
Right now we are entering data manually in this table using SSMS design view.
So is it possible to restrict someone entering such data?
This type of numeric constraint between two tables is not trivial to implement.
Here are four approaches.
(1) Create a table with one row per student, term, and year. Have two columns, one for each course. It is not possible to have more than two with this approach.
(2) Create a user-defined function that calculates the number of courses for a student in a given term and year. Then create a check constraint using this function.
(3) Create a with one per student, term, and year and a count of the courses for the term. Maintain this count using a trigger and place a check constraint on the value.
(4) Maintain all the logic in the "application". This would entail doing all inserts via a stored procedure and having the stored procedure validate the business logic.
In your case, I would probably opt for (2) as the last intrusive solution. If you already had a table with one row per student/term/year, I would suggest (3). It does involve a trigger (argggh!) but it also maintains useful information for other purposes.

Table field naming convention and SQL statements

I have a practical question regarding naming table fields in a database. For example, I have two tables:
student (id int; name varchar(30))
teacher (id int, s_id int; name varchar(30))
There are both 'id' and "name" in two tables. In SQL statement, it will be ambiguous for the two if no table names are prefixed. Two options:
use Table name as prefix of a field in SQL 'where' clause
use prefixed field names in tables so that no prefix will be used in 'where' clause.
Which one is better?
Without a doubt, go with option 1. This is valid sql in any type of database and considered the proper and most readable format. It's good habit to prefix the table name to a column, and very necessary when doing a join. The only exception to this I've most often seen is prefixing the id column with the table name, but I still wouldn't do that.
If you go with option 2, seasoned DBA's will probably point and laugh at you.
For further proof, see #2 here: https://www.periscopedata.com/blog/better-sql-schema.html
And here. Rule 1b - http://www.isbe.net/ILDS/pdf/SQL_server_standards.pdf
As TT mentions, you'll make your life much easier if you learn how to use an alias for the table name. It's as simple as using SomeTableNameThatsWayTooLong as long_table in your query, such as:
SELECT LT.Id FROM SomeTableNameThatsWayTooLong AS LT
For queries that aren't ad-hoc, you should always prefix every field with either the table name or table alias, even if the field name isn't ambiguous. This prevents the query from breaking later if someone adds a new column to one of the tables that introduces ambiguity.
So that would make "id" and "name" unambiguous. But I still recommend naming the primary key with something more specific than "id". In your example, I would use student_id and teacher_id. This helps prevent mistakes in joins. You will need more specific names anyway when you run into tables with more than one unique key, or multi-part keys.
It's worth thinking these things through, but in the end consistency may be the more important factor. I can deal with tables built around id instead of student_id, but I'm currently working with an inconsistent schema that uses all of the following: id, sid, systemid and specific names like taskid. That's the worst of both worlds.
I would use aliases rather than table names.
You can assign an alias to a table in a query, one that is shorter than the table name. That makes the query a lot more readable. Example:
SELECT
t.name AS teacher_name,
s.name AS student_name
FROM
teacher AS t
INNER JOIN student AS s ON
s.id=t.s_id;
You can of course use the table name if you don't use aliases, and that would be preferred over your option 2.
If it doesn't get too long, I prefer prefixing in the table themselves, e.g. teacher.teacher_id, student.student_name. That way, you are always sure which name or id your are talking about, even if you for get to prefix the table name.

SQL Basics: How to get details from multiple tables in one query?

I've used SQL for years but have never truly harnessed its potential.
For this example let's say I have two tables:
CREATE TABLE messages (
MessageID INTEGER NOT NULL PRIMARY KEY,
UserID INTEGER,
Timestamp INTEGER,
Msg TEXT);
CREATE TABLE users (
UserID INTEGER NOT NULL PRIMARY KEY,
UserName TEXT,
Age INTEGER,
Gender INTEGER,
WebURL TEXT);
As I understand it, PRIMARY KEY basically indexes the field so that it can be used as a rapid search later on - querying based on exact values of the primary key gets results extremely quickly even in huge tables. (which also enforces that the field must be unqiue in each record.)
In my current workflow, I'd do something like
SELECT * FROM messages;
and then in code, for each message, do:
SELECT * FROM users WHERE UserID = results['UserID'];
This obviously sounds very inefficient and I know it can be done a lot better.
What I want to end up with is a result set that contains all of the fields from messages, except that instead of the UserID field, it contains all of the fields from the users table that match that given UserID.
Could someone please give me a quick primer on how this sort of thing can be accomplished?
If it matters, I'm using SQLite3 as an SQL engine, but I also would possibly want to do this on MySQL.
Thank you!
Not sure about the requested order, but you can adapt it.
Just JOIN the tables on UserID
SELECT MESSAGES.*,
USERS.USERNAME,
USERS.AGE,
USERS.GENDER,
USERS.WEBURL
FROM MESSAGES
JOIN USERS
ON USERS.USERID = MESSAGES.USERID
ORDER BY MESSAGES.USERID,
MESSAGES.TIMESTAMP

SQL: unique constraint and aggregate function

Presently I'm learning (MS) SQL, and was trying out various aggregate function samples. The question I have is: Is there any scenario (sample query that uses aggregate function) where having a unique constraint column (on a table) helps when using an aggregate function.
Please note: I'm not trying to find a solution to a problem, but trying to see if such a scenario exist in real world SQL programming.
One immediate theoretical scenario comes to mind, the unique constraint is going to be backed by a unique index, so if you were just aggregating that field, the index would be narrower than scanning the table, but that would be on the basis that the query didn't use any other fields and was thus covering, otherwise it would tip out of the NC index.
I think the addition of the index to enforce the unique constraint is automatically going to have the ability to potentially help a query, but it might be a bit contrived.
Put the unique constraint on the field if you need the field to be unique, if you need some indexes to help query performance, consider them seperately, or add a unique index on that field + include other fields to make it covering (less useful, but more useful than the unique index on a single field)
Let's take following two tables, one has records for subject name and subject Id and another table contains record for student having marks in particular subjects.
Table1(SubjectId int unique, subject_name varchar, MaxMarks int)
Table2(Id int, StudentId int, SubjectId, Marks int)
so If I need to find AVG of marks obtained in Science subject by all student who have attempted for
Science(SubjectId =2) then I would fire following query.
SELECT AVG(Marks), MaxMarks
FROM Table1, Table2
WHERE Table1.SubjectId = 2;

Sql naming best practice

I'm not entirely sure if there's a standard in the industry or otherwise, so I'm asking here.
I'm naming a Users table, and I'm not entirely sure about how to name the members.
user_id is an obvious one, but I wonder if I should prefix all other fields with "user_" or not.
user_name
user_age
or just name and age, etc...
prefixes like that are pointless, unless you have something a little more arbitrary; like two addresses. Then you might use address_1, address_2, address_home, etc
Same with phone numbers.
But for something as static as age, gender, username, etc; I would just leave them like that.
Just to show you
If you WERE to prefix all of those fields, your queries might look like this
SELECT users.user_id FROM users WHERE users.user_name = "Jim"
When it could easily be
SELECT id FROM users WHERE username = "Jim"
I agree with the other answers that suggest against prefixing the attributes with your table names.
However, I support the idea of using matching names for the foreign keys and the primary key they reference1, and to do this you'd normally have to prefix the id attributes in the dependent table.
Something which is not very well known is that SQL supports a concise joining syntax using the USING keyword:
CREATE TABLE users (user_id int, first_name varchar(50), last_name varchar(50));
CREATE TABLE sales (sale_id int, purchase_date datetime, user_id int);
Then the following query:
SELECT s.*, u.last_name FROM sales s JOIN users u USING (user_id);
is equivalent to the more verbose and popular joining syntax:
SELECT s.*, u.last_name FROM sales s JOIN users u ON (u.user_id = s.user_id);
1 This is not always possible. A typical example is a user_id field in a users table, and reported_by and assigned_to fields in the referencing table that both reference the users table. Using a user_id field in such situations is both ambiguous, and not possible for one of the fields.
As other answers suggest, it is a personal preference - pick up certain naming schema and stick to it.
Some 10 years ago I worked with Oracle Designer and it uses naming schema that I like and use since then:
table names are plural - USERS
surrogate primary key is named as singular of table name plus '_id' - primary key for table USERS would be "USER_ID". This way you have consistent naming when you use "USER_ID" field as foreign key in some other table
column names don't have table name as prefix.
Optionally:
in databases with large number of tables (interpret "large" as you see fit), use 2-3
characters table prefixes so that you can logically divide tables in areas. For example: all tables that contain sales data (invoices, invoice items, articles) have prefix "INV_", all tables that contain human resources data have prefix "HR_". That way it is easier to find and sort tables that contain related data (this could also be done by placing tables in different schemes and setting appropriate access rights, but it gets complicated when you need to create more than one database on one server)
Again, pick naming schema you like and be consistent.
Just go with name and age, the table should provide the necessary context when you're wondering what kind of name you're working with.
Look at it as an entity and name the fields accordingly
I'd suggest a User table, with fields such as id, name, age, etc.
A group of records is a bunch of users, but the group of fields represents a user.
Thus, you end up referring to user.id, user.name, user.age (though you won't always include the table name, depending on the query).
For the table names, I usually use pluralized nouns (or noun phrases), like you.
For column names I'd not use the table name as prefix. The table itself specifies the context of the column.
table users (plural):
id
name
age
plain and simple.
It's personal preference. The best advice we can give you is consistency, legibility and ensuring the relationships are correctly named as well.
Use names that make sense and aren't abbreviated if possible, unless the storage mechanism you are using doesn't work well with them.
In relationships, I like to use Id on the primary key and [table_name]_Id on the foreign key. eg. Order.Id and OrderItem.OrderId
Id works well if using a surrogate key as a primary key.
Also your storage mechanism may or may not be case sensitive, so be sure to that into account.
Edit: Also, thre is some theory to suggest that table should be name after what a single record in that table should represent. So, table name "User" instead of "Users" - personally the plural makes more sense to me, just keep it consistent.
First of all, I would suggest using the singular noun, i.e. user instead of users, although this is more of a personal preference.
Second, there are some who prefer to always name the primary key column id, instead of user_id (i.e. table name + id), and similar with for example name instead of employee_name. I think this is a bad idea for the following reason:
-- when every table has an "id" (or "name") column, you get duplicate column names in the output:
select e.id, e.name, d.id, d.name
from employee e, department d
where e.department_id = d.id
-- to avoid this, you need to specify column aliases every time you query:
select e.id employee_id, e.name employee_name, d.id department_id, d.name department_name
from employee e, department d
where e.department_id = d.id
-- if the column name includes the table, there are no conflicts, and the join condition is very clear
select e.employee_id, e.employee_name, d.department_id, d.department_name
from employee e, department d
where e.department_id = d.department_id
I'm not saying you should include the table name in every column in the table, but do it for the key (id) column and other "generic" columns such as name, description, remarks, etc. that are likely to be included in queries.
I explicitly named my columns using a prefix that was related to the table
i.e. table = USERS, column name = user_id, user_name, user_address_street, etc.
before that, when i started using JOINS I had to alias the crap out of the column names to avoid conflict in the query results, and then when accessed from templates in a MVC View, if the query result field name didn't match the published db schema, the template designers would get all confused and have to ask for the SQL VIEW to determine the correct field name to use.
So it looks messy to use a prefix in a column name, but in practice it works better for us.
I'm not entirely sure if there's a standard in the industry
Yes: ISO 11179-5: Naming and identification principles, available here.
I think table and column names must be like that.
Table Name :
User --> Capitalize Each Word and not plural.
Column Names :
Id --> If i see "Id" I understand this is PK column.
GroupId --> I understanding there is an table which named Group and this column is relation column for Group table.
Name --> If there is a column which named "Name" in User table, this means name of user. It's enaughly clear.
Especially if you are using Entity Framework I suppose this more.
Note: Sorry for my bad English. If somebody will correct my bad English i will be happy.