Multi-table, multi-row SQL select - sql

How would I list all of the info about a freelancer given the schema below, including niche, language, market, etc.? The issue I am having is that every freelancer can have multiple entries in each table. So how would I do this? Is it even possible using SQL, or would I need to use my primary language (golang) for this?
CREATE TABLE freelancer (
freelancer_id SERIAL PRIMARY KEY,
ip inet NOT NULL,
username VARCHAR(20) NOT NULL,
password VARCHAR(100) NOT NULL,
email citext NOT NULL UNIQUE,
email_verified int NOT NULL,
fname VARCHAR(20) NOT NULL,
lname VARCHAR(20) NOT NULL,
phone_number VARCHAR(30) NOT NULL,
address VARCHAR(50) NOT NULL,
city VARCHAR(30) NOT NULL,
state VARCHAR(30) NOT NULL,
zip int NOT NULL,
country VARCHAR(30) NOT NULL
);
CREATE TABLE market (
market_id SERIAL PRIMARY KEY,
market_name VARCHAR(30) NOT NULL
);
CREATE TABLE niche (
niche_id SERIAL PRIMARY KEY,
niche_name VARCHAR(30) NOT NULL
);
CREATE TABLE medium (
medium_id SERIAL PRIMARY KEY,
medium_name VARCHAR(30) NOT NULL
);
CREATE TABLE format (
format_id SERIAL PRIMARY KEY,
format_name VARCHAR(30) NOT NULL
);
CREATE TABLE lang (
lang_id SERIAL PRIMARY KEY,
lang_name VARCHAR(30) NOT NULL
);
CREATE TABLE freelancer_by_niche (
id SERIAL PRIMARY KEY,
niche_id int NOT NULL REFERENCES niche (niche_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
CREATE TABLE freelancer_by_medium (
id SERIAL PRIMARY KEY,
medium_id int NOT NULL REFERENCES medium (medium_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
CREATE TABLE freelancer_by_market (
id SERIAL PRIMARY KEY,
market_id int NOT NULL REFERENCES market (market_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
CREATE TABLE freelancer_by_format (
id SERIAL PRIMARY KEY,
format_id int NOT NULL REFERENCES format (format_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);
CREATE TABLE freelancer_by_lang (
id SERIAL PRIMARY KEY,
lang_id int NOT NULL REFERENCES lang (lang_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id)
);

SELECT *
FROM freelancer
INNER JOIN freelancer_by_niche USING (freelancer_id)
INNER JOIN niche USING (niche_id)
INNER JOIN freelancer_by_medium USING (freelancer_id)
INNER JOIN medium USING (medium_id)
INNER JOIN freelancer_by_market USING (freelancer_id)
INNER JOIN market USING (market_id)
INNER JOIN freelancer_by_format USING (freelancer_id)
INNER JOIN format USING (format_id)
INNER JOIN freelancer_by_lang USING (freelancer_id)
INNER JOIN lang USING (lang_id);
And if you want to drop the unnecessary attributes from join tables like freelancer_by_format, you can do this:
SELECT a.ip, a.username, a.password, a.email, a.email_verified,
a.fname, a.lname, a.phone_number, a.address, a.city,
a.state, a.zip, a.country,
b.niche_name, c.medium_name, d.market_name, e.format_name, f.lang_name
FROM freelancer a
INNER JOIN freelancer_by_niche USING (freelancer_id)
INNER JOIN niche b USING (niche_id)
INNER JOIN freelancer_by_medium USING (freelancer_id)
INNER JOIN medium c USING (medium_id)
INNER JOIN freelancer_by_market USING (freelancer_id)
INNER JOIN market d USING (market_id)
INNER JOIN freelancer_by_format USING (freelancer_id)
INNER JOIN format e USING (format_id)
INNER JOIN freelancer_by_lang USING (freelancer_id)
INNER JOIN lang f USING (lang_id);
And if you want to change the column names, for example change "market_name" to just "market", you can alias them:
SELECT a.ip, ... ,
d.market_name AS "market", e.format_name AS "format", ...
FROM ...
Remarks
In your join tables (for example freelancer_by_niche) there is no UNIQUE constraint on freelancer_id, which means that you could have the same freelancer in multiple niches (that's OK and probably intended).
But you also don't have a UNIQUE constraint on the pair of attributes (freelancer_id, niche_id), which means that a freelancer could be in the SAME niche multiple times ("Joe is in electronics. Three times.").
You could prevent that by making (freelancer_id, niche_id) UNIQUE in freelancer_by_niche.
This way you would also not need the surrogate (artificial) PRIMARY KEY freelancer_by_niche (id).
So what could go wrong then?
For example imagine the same information about a freelancer in the same niche three times (the same data parts of the row three times):
freelancer_by_niche
id | freelancer_id | niche_id
1 | 1 | 1 -- <-- same data (1, 1), different serial id
2 | 1 | 1 -- <-- same data (1, 1), different serial id
3 | 1 | 1 -- <-- same data (1, 1), different serial id
Then the result of the above query would return each possible row three (!) times with the same (!) content, because freelancer_by_niche can be combined three times with all the other JOINs.
You can eliminate these duplicates by adding DISTINCT to the query above: SELECT DISTINCT a.ip, ... FROM ...
What if you get many duplicate rows, for example 10 data duplicates in each of the 5 JOIN tables (freelancer_by_niche, freelancer_by_medium etc.)? You would get 10 * 10 * 10 * 10 * 10 = 10^5 = 100,000 copies of every result row, all carrying exactly the same information.
If you then ask your DBMS to eliminate duplicates with SELECT DISTINCT ..., it has to process 100,000 duplicates for each distinct row, because duplicates can only be detected by sorting (or hashing, but never mind). If you have 1,000 distinct rows for freelancers on markets, niches, languages etc., you are asking your DBMS to sort 1,000 * 100,000 = 100,000,000 rows to reduce them to the 1,000 unique rows.
That is about 100 million unnecessary rows.
Please make UNIQUE (freelancer_id, niche_id) for freelancer_by_niche and the other JOIN tables.
(By data duplicates I mean that the data (niche_id, freelancer_id) is the same, and only the id is an auto-incremented serial.)
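The multiplication described above can be checked on a tiny scale. The following is a minimal sketch, not the answerer's code: it assumes an in-memory SQLite database instead of Postgres, keeps only two of the five join tables, and uses made-up sample rows ('joe', 'electronics', 'english'). The helper name count_join_rows is hypothetical.

```python
import sqlite3

def count_join_rows(copies):
    """Return (plain row count, DISTINCT row count) for a single freelancer."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE freelancer (freelancer_id INTEGER PRIMARY KEY, username TEXT NOT NULL);
        CREATE TABLE niche (niche_id INTEGER PRIMARY KEY, niche_name TEXT NOT NULL);
        CREATE TABLE lang (lang_id INTEGER PRIMARY KEY, lang_name TEXT NOT NULL);
        -- Join tables with a surrogate id and NO unique constraint, as in the question.
        CREATE TABLE freelancer_by_niche (
            id INTEGER PRIMARY KEY,
            niche_id INTEGER NOT NULL REFERENCES niche (niche_id),
            freelancer_id INTEGER NOT NULL REFERENCES freelancer (freelancer_id));
        CREATE TABLE freelancer_by_lang (
            id INTEGER PRIMARY KEY,
            lang_id INTEGER NOT NULL REFERENCES lang (lang_id),
            freelancer_id INTEGER NOT NULL REFERENCES freelancer (freelancer_id));
        INSERT INTO freelancer VALUES (1, 'joe');
        INSERT INTO niche VALUES (1, 'electronics');
        INSERT INTO lang VALUES (1, 'english');
    """)
    # Insert the same (niche, freelancer) and (lang, freelancer) pair several times.
    for _ in range(copies):
        con.execute("INSERT INTO freelancer_by_niche (niche_id, freelancer_id) VALUES (1, 1)")
        con.execute("INSERT INTO freelancer_by_lang (lang_id, freelancer_id) VALUES (1, 1)")
    query = """
        SELECT {distinct} a.username, b.niche_name, f.lang_name
        FROM freelancer a
        INNER JOIN freelancer_by_niche USING (freelancer_id)
        INNER JOIN niche b USING (niche_id)
        INNER JOIN freelancer_by_lang USING (freelancer_id)
        INNER JOIN lang f USING (lang_id)
    """
    plain = con.execute(query.format(distinct="")).fetchall()
    unique = con.execute(query.format(distinct="DISTINCT")).fetchall()
    con.close()
    return len(plain), len(unique)
```

With three copies in each of the two join tables, count_join_rows(3) reports 3 * 3 = 9 plain rows and a single DISTINCT row, which is the same effect as the 10^5 figure above at a smaller size.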
You can easily reproduce the problem by doing the following:
-- This duplicates all data in your JOIN tables once. Do it many times.
-- The id column is omitted so that the SERIAL default fills it in.
INSERT INTO freelancer_by_niche (niche_id, freelancer_id)
SELECT niche_id, freelancer_id FROM freelancer_by_niche;
INSERT INTO freelancer_by_medium (medium_id, freelancer_id)
SELECT medium_id, freelancer_id FROM freelancer_by_medium;
INSERT INTO freelancer_by_market (market_id, freelancer_id)
SELECT market_id, freelancer_id FROM freelancer_by_market;
INSERT INTO freelancer_by_format (format_id, freelancer_id)
SELECT format_id, freelancer_id FROM freelancer_by_format;
INSERT INTO freelancer_by_lang (lang_id, freelancer_id)
SELECT lang_id, freelancer_id FROM freelancer_by_lang;
Display the duplicates using
SELECT * FROM freelancer_by_lang;
Now try the SELECT * FROM freelancer INNER JOIN ... query again.
If it still runs fast, repeat all the INSERT INTO freelancer_by_niche ... statements again and again, until it takes forever to calculate the results
(or until you get duplicates, which you can remove with DISTINCT).
Create UNIQUE data JOIN tables
You can prevent duplicates in your join tables.
Remove the id SERIAL PRIMARY KEY and replace it with a multi-attribute PRIMARY KEY (a, b):
CREATE TABLE freelancer_by_niche (
niche_id int NOT NULL REFERENCES niche (niche_id),
freelancer_id int NOT NULL REFERENCES freelancer (freelancer_id),
PRIMARY KEY (freelancer_id, niche_id)
);
(Apply this for all your join tables).
The PRIMARY KEY (freelancer_id, niche_id) will create a UNIQUE index.
This way you cannot insert duplicate data. Try the INSERTs above: they will be rejected because the information is already there once. Adding it another time would not add more information AND would make your query runtime much slower.
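To see the rejection in action, here is a small sketch under stated assumptions: it uses an in-memory SQLite database rather than Postgres, leaves out the REFERENCES clauses for brevity, and the helper name insert_twice is hypothetical. Postgres reports a unique violation in the same situation; the exception type below is specific to Python's sqlite3 module.

```python
import sqlite3

def insert_twice():
    """Insert the same (niche_id, freelancer_id) pair twice; return (rejected, row count)."""
    con = sqlite3.connect(":memory:")
    con.execute("""
        CREATE TABLE freelancer_by_niche (
            niche_id INTEGER NOT NULL,
            freelancer_id INTEGER NOT NULL,
            PRIMARY KEY (freelancer_id, niche_id)
        )
    """)
    insert = "INSERT INTO freelancer_by_niche (niche_id, freelancer_id) VALUES (1, 1)"
    con.execute(insert)  # first copy is accepted
    try:
        con.execute(insert)  # second copy violates the composite primary key
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    rows = con.execute("SELECT COUNT(*) FROM freelancer_by_niche").fetchone()[0]
    con.close()
    return rejected, rows
```

The second insert is refused and the table still holds exactly one row for that pair.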
NON-unique index on the other part of the JOIN tables
With PRIMARY KEY (freelancer_id, niche_id), Postgres creates a unique index on these two attributes (columns).
Accessing or JOINing by freelancer_id is fast, because it is the first column in the index. Accessing or JOINing by freelancer_by_niche.niche_id alone will usually be slow, because the planner cannot seek directly into the index on its second column and typically has to scan the whole table or index.
Therefore you should also create an INDEX on the second part, niche_id, in freelancer_by_niche:
CREATE INDEX ON freelancer_by_niche (niche_id) ;
Then joins into this table on niche_id will also be faster, because they are accelerated by an index. The index makes queries faster (usually).
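One way to observe the effect of the extra index is to compare query plans before and after creating it. The sketch below is only an illustration with SQLite, whose planner differs from Postgres (there you would use EXPLAIN instead). The index name idx_fbn_niche and both helper names are assumptions made for this example; SQLite requires an index name, unlike the unnamed CREATE INDEX above.

```python
import sqlite3

def plan_details(con, sql):
    # The human-readable plan text is the fourth column of each EXPLAIN QUERY PLAN row.
    return [row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql)]

def plans_before_and_after():
    """Return the plan text for a niche_id lookup before and after adding the index."""
    con = sqlite3.connect(":memory:")
    con.execute("""
        CREATE TABLE freelancer_by_niche (
            niche_id INTEGER NOT NULL,
            freelancer_id INTEGER NOT NULL,
            PRIMARY KEY (freelancer_id, niche_id)
        )
    """)
    lookup = "SELECT freelancer_id FROM freelancer_by_niche WHERE niche_id = 1"
    before = plan_details(con, lookup)
    # Hypothetical index name; it covers the second column of the primary key.
    con.execute("CREATE INDEX idx_fbn_niche ON freelancer_by_niche (niche_id)")
    after = plan_details(con, lookup)
    con.close()
    return before, after
```

Printing both lists shows the lookup switching to a search on idx_fbn_niche once the index exists; the exact wording of the plan text varies between SQLite versions.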
Summary
You have a very well normalized database schema. Small improvements can still be made (see above).

Related

How does an inner join work on two foreign keys to a single table

I am working on a bus route management system. I made two tables, the first one is Cities and the second one is route, with the following definitions:
CREATE TABLE Cities
(
ID NUMBER GENERATED ALWAYS AS IDENTITY(START with 1 INCREMENT by 1) PRIMARY KEY,
Name Varchar(30) not null
);
CREATE TABLE route
(
ID NUMBER GENERATED ALWAYS AS IDENTITY(START with 1 INCREMENT by 1) PRIMARY KEY,
Name Varchar(30) not null,
from NUMBER not null,
to NUMBER NOT NULL,
CONSTRAINT FROM_id_FK FOREIGN KEY(from) REFERENCES Cities(ID),
CONSTRAINT TO_id_FK FOREIGN KEY(to) REFERENCES Cities(ID)
);
I am joining the tables with an inner join:
select CITIES.Name
from CITIES
inner join ROUTES on CITIES.ID=ROUTES.ID
but it shows a single column:
Name
-----------
but I want the result as
from | to
------------------------
What is a possible way to do this using an inner join?
I suspect you need something like the following:
select r.Name, cs.Name SourceCity, cd.Name DestinationCity
from routes r
join cities cs on cs.id = r.from
join cities cd on cd.id = r.to
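The double join on the cities table can be tried out with a small sketch. It assumes an in-memory SQLite database and renames the columns to from_city and to_city, because from and to are reserved words; those column names, the sample rows and the helper name route_endpoints are all made up for illustration.

```python
import sqlite3

def route_endpoints():
    """Return (route name, source city, destination city) for every route."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE routes (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            from_city INTEGER NOT NULL REFERENCES cities (id),
            to_city INTEGER NOT NULL REFERENCES cities (id));
        INSERT INTO cities VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
        INSERT INTO routes VALUES (1, 'R1', 1, 2), (2, 'R2', 3, 1);
    """)
    # cities is joined twice under two aliases: once per foreign key.
    rows = con.execute("""
        SELECT r.name, cs.name AS source_city, cd.name AS destination_city
        FROM routes r
        JOIN cities cs ON cs.id = r.from_city
        JOIN cities cd ON cd.id = r.to_city
        ORDER BY r.id
    """).fetchall()
    con.close()
    return rows
```

Each alias (cs, cd) acts as an independent copy of cities, so one row can carry both the source and the destination name.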
Hope this works for you:
select CITIES.Name,ROUTES.from,ROUTES.to
from CITIES inner join ROUTES on CITIES.ID=ROUTES.ID

Select a product that is on all interventions

Hello, my question is probably simple for some of you ^^
I have three tables: products, replacements, and interventions. When there is an intervention, the replacements table links the intervention to the products needed for it.
I would like to know how to find the products that have been part of all interventions.
These are my tables:
--TABLE products
create table products (
reference char(5) not null check ( reference like 'DT___'),
designation char(50) not null,
price numeric (9,2) not null,
primary key(reference) );
-- TABLE interventions
create table interventions (
nointerv integer not null ,
dateinterv date not null,
nameresponsable char(30) not null,
nameinterv char(30) not null,
time float not null check ( time != 0 AND time between 0 and 8),
nocustomers integer not null ,
nofact integer not null ,
primary key( nointerv),
foreign key( nocustomers) references customers,
foreign key (nofact) references facts
);
-- TABLE replacements
create table replacements (
reference char(5) not null check ( reference like 'DT%'),
nointerv integer not null,
qtereplaced smallint,
primary key ( reference, nointerv ),
foreign key (reference) references products,
foreign key(nointerv) references interventions(nointerv)
);
--EDIT :
This is a select from my replacements table (the screenshot from the original post is not reproduced here).
In it we can see that the product DT802 is used in every intervention.
Thanks ;)
This shows one row per intervention and product. Is this what you are expecting?
select interventions.nointerv, products.reference
from interventions
inner join replacements on interventions.nointerv = replacements.nointerv
inner join products on replacements.reference = products.reference;
This one?
select products.reference, products.designation
from interventions
inner join replacements on interventions.nointerv = replacements.nointerv
inner join products on replacements.reference = products.reference
group by products.reference, products.designation
having count(*) = (select count(*) from interventions);
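The HAVING comparison above can be checked against a tiny data set. This is a minimal sketch assuming an in-memory SQLite database and trimmed-down tables; DT802 comes from the question, while DT803, the designations, the three interventions and the helper name products_in_all_interventions are invented for the example.

```python
import sqlite3

def products_in_all_interventions():
    """Return the products that appear in every intervention."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE products (reference TEXT PRIMARY KEY, designation TEXT NOT NULL);
        CREATE TABLE interventions (nointerv INTEGER PRIMARY KEY);
        CREATE TABLE replacements (
            reference TEXT NOT NULL REFERENCES products (reference),
            nointerv INTEGER NOT NULL REFERENCES interventions (nointerv),
            PRIMARY KEY (reference, nointerv));
        INSERT INTO products VALUES ('DT802', 'part A'), ('DT803', 'part B');
        INSERT INTO interventions VALUES (1), (2), (3);
        -- DT802 is used in all three interventions, DT803 in only one.
        INSERT INTO replacements VALUES
            ('DT802', 1), ('DT802', 2), ('DT802', 3), ('DT803', 1);
    """)
    rows = con.execute("""
        SELECT p.reference, p.designation
        FROM interventions i
        INNER JOIN replacements r ON i.nointerv = r.nointerv
        INNER JOIN products p ON r.reference = p.reference
        GROUP BY p.reference, p.designation
        HAVING COUNT(*) = (SELECT COUNT(*) FROM interventions)
    """).fetchall()
    con.close()
    return rows
```

The count comparison is only reliable because the primary key (reference, nointerv) guarantees that a product is listed at most once per intervention.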
Your question is hard to follow. If I interpret it as all nointerv in replacements whose reference contains all products, then:
select nointerv
from replacements r
group by nointerv
having count(distinct reference) = (select count(*) from products);

How to relate tables SQL

I have three tables and I want to relate them, but I don't know what I'm doing wrong. If my approach is bad, can you also correct me?
I have a clients table with the id_c column as primary key:
create table clients
(
id_c INTEGER not null,
name VARCHAR2(20),
age INTEGER,
address VARCHAR2(20),
Primary key (id_c)
);
I also have a products table with the id_p column as primary key:
create table PRODUCTS
(
id_p NUMBER not null,
name_product VARCHAR2(30),
price NUMBER,
duration NUMBER,
primary key (id_p)
);
and now I create a third one:
create table TRANSACTIONS
(
id_t NUMBER not null,
id_c NUMBER not null,
id_p NUMBER not null,
primary key (ID_t),
foreign key (ID_c) references CLIENTS (ID_c),
foreign key (ID_p) references PRODUCTS (ID_p)
);
Now I want to see all records that are connected, so I'm trying to use this:
select * from transactions join clients using (id_c) and join products using (id_p);
but the only thing that works is
select * from transactions join clients using (id_c);
Is this a relational database, or am I making something too simple and primitive? How can I connect everything?
Try this:
select *
from transactions
inner join clients on transactions.id_c = clients.id_c
inner join products on transactions.id_p = products.id_p;
Are you just trying to join?
select * from transactions a
join clients b on a.id_c = b.id_c
join products c on a.id_p = c.id_p
If you want to join 3 tables, just write:
SELECT * FROM TRANSACTIONS t JOIN CLIENTS c ON t.id_c = c.id_c JOIN PRODUCTS p ON t.id_p = p.id_p
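The asker's original USING form also works once the stray "and" between the two joins is removed, since joins are simply chained one after another. A minimal sketch, assuming an in-memory SQLite database, trimmed-down tables and made-up sample rows; the helper name joined_transactions is hypothetical.

```python
import sqlite3

def joined_transactions():
    """Join transactions to both clients and products by chaining two JOINs."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE clients (id_c INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE products (id_p INTEGER PRIMARY KEY, name_product TEXT);
        CREATE TABLE transactions (
            id_t INTEGER PRIMARY KEY,
            id_c INTEGER NOT NULL REFERENCES clients (id_c),
            id_p INTEGER NOT NULL REFERENCES products (id_p));
        INSERT INTO clients VALUES (1, 'Ann');
        INSERT INTO products VALUES (10, 'Widget');
        INSERT INTO transactions VALUES (100, 1, 10);
    """)
    # No AND between the joins: each JOIN clause follows the previous one directly.
    rows = con.execute("""
        SELECT id_t, name, name_product
        FROM transactions
        JOIN clients USING (id_c)
        JOIN products USING (id_p)
    """).fetchall()
    con.close()
    return rows
```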

SQL Anomaly Using 'USING' Clause with Nested Queries?

I have a normalized database containing 3 tables whose DDL is this:
CREATE CACHED TABLE Clients (
cli_id INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 100) PRIMARY KEY,
defmrn_id BIGINT,
lastName VARCHAR(48) DEFAULT '' NOT NULL,
midName VARCHAR(24) DEFAULT '' NOT NULL,
firstName VARCHAR(24) DEFAULT '' NOT NULL,
doB INTEGER DEFAULT 0 NOT NULL,
gender VARCHAR(1) NOT NULL);
CREATE TABLE Client_MRNs (
mrn_id BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 100) PRIMARY KEY,
cli_id INTEGER REFERENCES Clients ( cli_id ),
inst_id INTEGER REFERENCES Institutions ( inst_id ),
mrn VARCHAR(32) DEFAULT '' NOT NULL,
CONSTRAINT climrn01 UNIQUE (mrn, inst_id));
CREATE TABLE Institutions (
inst_id INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 100) PRIMARY KEY,
loc_id INTEGER REFERENCES Locales (loc_id ),
itag VARCHAR(6) UNIQUE NOT NULL,
iname VARCHAR(80) DEFAULT '' NOT NULL);
The first table contains a foreign key column, defmrn_id, that is a reference to a "default identifier code" that is stored in the second table (which is a list of all identifier codes). A record in the first table may have many identifiers, but only one default identifier. So yeah, I have created a circular reference.
The third table is just normalized data from the second table.
I wanted a query that would find a CLIENT record based on matching a supplied identifier code to any of the identifier codes in CLIENT_MRNs that may belong to that CLIENT record.
My strategy was to first identify those records that matched in the second table (CLIENT_MRN) and then use that intermediate result to join to records in the CLIENT table that matched other user-supplied searching criteria. I also need to denormalize the identifier reference defmrn_id in the 1st table. Here is what I came up with...
SQL = SELECT c.*, r.mrn, i.inst_id, i.itag, i.iname
FROM Clients AS c
INNER JOIN
(
SELECT m.cli_id
FROM Client_MRNs AS m
WHERE m.mrn = ?
) AS m2 ON m2.cli_id = c.cli_id
INNER JOIN Client_MRNs AS r ON c.defmrn_id = r.mrn_id
INNER JOIN Institutions AS i USING ( inst_id )
WHERE (<other user supplied search criteria...>);
The above works, but I spent some time trying to understand why the following was NOT working...
SQL = SELECT c.*, r.mrn, i.inst_id, i.itag, i.iname
FROM Clients AS c
INNER JOIN
(
SELECT m.cli_id
FROM Client_MRNs AS m
WHERE m.mrn = ?
) AS m2 USING ( cli_id )
INNER JOIN Client_MRNs AS r ON c.defmrn_id = r.mrn_id
INNER JOIN Institutions AS i USING ( inst_id )
WHERE (<other user supplied search criteria...>);
It seems to me that the second SQL should work, but it fails on the USING clause every time. I am executing these queries against a database managed by HSQLDB 2.2.9 as the RDBMS. Is this a parsing issue in HSQLDB or is this a known limitation of the USING clause with nested queries?
You can always try with HSQLDB 2.3.0 (a release candidate).
The way you report the incomplete SQL does not allow proper checking. But there is an obvious mistake in the query. If you have:
SELECT INST_ID FROM CLIENT_MRNS AS R INNER JOIN INSTITUTIONS AS I USING (INST_ID)
INST_ID can be used in the SELECT column list only without a table qualifier. The reason is that it is no longer considered a column of either table. The same is true of the common columns if you use NATURAL JOIN.
This query is accepted by version 2.3.0:
SELECT c.*, r.mrn, inst_id, i.itag, i.iname
FROM Clients AS c
INNER JOIN
(
SELECT m.cli_id
FROM Client_MRNs AS m
WHERE m.mrn = 2
) AS m2 USING ( cli_id )
INNER JOIN Client_MRNs AS r ON c.defmrn_id = r.mrn_id
INNER JOIN Institutions AS i USING ( inst_id )

A join retrieve null column

I have 2 tables
create table Students(
SerialNumber int primary key identity,
Name varchar(50) not null,
Surname varchar(50) not null,
AcademicProgram int foreign key references AcademicProgrammes(Id)
)
Create table AcademicProgrammes(
Id int primary key identity,
Name varchar (20) not null
)
and I want to get all the students from the Students table, but instead of the AcademicProgram foreign key I want the name of the academic programme.
My join looks like this:
select Students.SerialNumber,Students.Name, Students.Surname, AcademicProgrammes.Name
from Students left join
AcademicProgrammes on Students.SerialNumber=AcademicProgrammes.Id
If I have 2 academic programs, master's and undergraduate,
I get all the students as a result, but only the first 2 students have the name of the academic program in that column; the rest of them have NULL:
Vasile Magdalena-Maria Licenta
Ciotmonda Oana-Maria Master
Rus Diana NULL
Turcu Gabriel NULL
I can't find what I'm doing wrong
Thanks !
I believe you need to join by
Students.AcademicProgram=AcademicProgrammes.Id
instead of
Students.SerialNumber=AcademicProgrammes.Id
Because of that, you're getting names of academic programs only for students with serial numbers 1 and 2 (since you have only two programs).
Therefore, try the following:
SELECT s.SerialNumber,
s.Name,
s.Surname,
a.Name AS Program
FROM Students s LEFT JOIN
AcademicProgrammes a ON s.AcademicProgram=a.Id
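The difference between the two join conditions can be reproduced with a short sketch. It assumes an in-memory SQLite database instead of SQL Server, shortened student names, and invented programme assignments for the third and fourth student (the question does not say which programmes they belong to). The helper name program_names is hypothetical.

```python
import sqlite3

def program_names(join_column):
    """LEFT JOIN Students to AcademicProgrammes on the given Students column."""
    # Only these two column names are accepted, since the name is spliced into the SQL.
    if join_column not in ("AcademicProgram", "SerialNumber"):
        raise ValueError("unexpected join column")
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE AcademicProgrammes (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
        CREATE TABLE Students (
            SerialNumber INTEGER PRIMARY KEY,
            Name TEXT NOT NULL,
            AcademicProgram INTEGER REFERENCES AcademicProgrammes (Id));
        INSERT INTO AcademicProgrammes VALUES (1, 'Licenta'), (2, 'Master');
        INSERT INTO Students VALUES
            (1, 'Vasile', 1), (2, 'Ciotmonda', 2), (3, 'Rus', 1), (4, 'Turcu', 2);
    """)
    rows = con.execute(f"""
        SELECT s.Name, a.Name AS Program
        FROM Students s
        LEFT JOIN AcademicProgrammes a ON s.{join_column} = a.Id
        ORDER BY s.SerialNumber
    """).fetchall()
    con.close()
    return rows
```

Calling program_names("SerialNumber") reproduces the NULLs from the question for students 3 and 4, while program_names("AcademicProgram") returns a programme name for every student.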