PostgreSQL questions, constraints and queries - sql

My task is to make a table that records the placement won by race car drivers competing in Race events.
The given schema is:
CREATE TABLE RaceEvent
(
Name text,
Year int,
);
CREATE TABLE Driver
(
Name text,
Date_of_birth date,
Gender char,
Nationality,
);
I then added the following constraints :
CREATE TABLE RaceEvent
(
RaceName text NOT NULL PRIMARY KEY,
Year int NOT NULL,
Description text NOT NULL
);
CREATE TABLE Driver
(
Name text NOT NULL,
Date_of_birth date NOT NULL PRIMARY KEY,
Gender char(1) NOT NULL,
Nationality text NOT NULL
);
The table I created looks like this :
CREATE TABLE Races
(
Medal char(6) CHECK (Medal = 'Gold' or Medal = 'Silver' or Medal = 'Bronze'),
Event text NOT NULL REFERENCES RaceEvent (RaceName),
DriverDOB date NOT NULL REFERENCES Driver (Date_of_birth)
);
I know using the date of birth as a primary key is very silly but for some reason that was part of the task.
I need to ensure a driver cannot gain multiple medals in the same race, can anybody give insight on a good way of doing this? I thought about using some sort of check but can't quite work it out.
After that, I need to write a query that can return the nationalities of drivers that won at least 2 gold medals in certain years, to figure out which nationalities seem to produce the best drivers. 2 versions of the same query, one using aggregation and one not.
I know I have to do something along these lines :
SELECT Nationality from Driver JOIN Races ON Driver.Date_of_Birth = Races.DriverDOB WHERE ....?
Not sure on what the best way of figuring out how to link the nationalities to the medals?
All feedback much appreciated

The "best" way to do it would be to restructure your schema, as right now it's pretty crap. I'm assuming you can't, so here's one way to prevent a driver from gaining multiple medals in the same race: add a composite primary key on (DriverDOB, Event) to the Races table.
Try it out here: http://sqlfiddle.com/#!17/dc8a9/1
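As a rough illustration of the composite-key idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The race name and date are made-up sample values, and the foreign keys are left out to keep the example self-contained. The same PRIMARY KEY (Event, DriverDOB) clause should work in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Races (
        Medal     TEXT CHECK (Medal IN ('Gold', 'Silver', 'Bronze')),
        Event     TEXT NOT NULL,
        DriverDOB TEXT NOT NULL,
        PRIMARY KEY (Event, DriverDOB)  -- one row per driver per race
    )
""")
# Hypothetical sample row: the driver wins gold in this race.
conn.execute("INSERT INTO Races VALUES ('Gold', 'Monaco GP', '1985-01-07')")


def try_second_medal():
    """Return True if a second medal for the same driver and race is rejected."""
    try:
        conn.execute("INSERT INTO Races VALUES ('Silver', 'Monaco GP', '1985-01-07')")
    except sqlite3.IntegrityError:
        return True
    return False


rejected = try_second_medal()
```

The second insert violates the key, so the database itself refuses it and no CHECK constraint or application logic is needed.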
As for the query to get the nationalities with multiple golds in a given year, here's one way to do it:
SELECT d.nationality, COUNT(*) AS golds
FROM races r
JOIN driver d
ON r.driverdob = d.date_of_birth
JOIN raceevent e
ON r.event = e.racename
AND e.year = 1999
WHERE r.medal = 'Gold'
GROUP BY d.nationality
HAVING COUNT(*) > 1;
Output:
nationality golds
NatA 3
NatB 2
And you can test it here: http://sqlfiddle.com/#!17/dc8a9/9
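The question also asked for a version without aggregation. One possible approach, sketched here with Python's sqlite3 as a stand-in for PostgreSQL, is a self-join that looks for two different gold-medal rows from the same nationality in the same year. The sample rows and the helper name nationalities_with_two_golds are made up for the illustration, and the sketch assumes the composite key on (Event, DriverDOB) so that two distinct rows always differ in race or driver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RaceEvent (RaceName TEXT PRIMARY KEY, Year INT NOT NULL);
    CREATE TABLE Driver (Name TEXT NOT NULL, Date_of_birth TEXT PRIMARY KEY,
                         Nationality TEXT NOT NULL);
    CREATE TABLE Races (Medal TEXT, Event TEXT NOT NULL, DriverDOB TEXT NOT NULL,
                        PRIMARY KEY (Event, DriverDOB));
    INSERT INTO RaceEvent VALUES ('Race A', 1999), ('Race B', 1999), ('Race C', 2000);
    INSERT INTO Driver VALUES ('Ann', '1970-01-01', 'NatA'),
                              ('Bob', '1971-02-02', 'NatA'),
                              ('Cy',  '1972-03-03', 'NatB');
    INSERT INTO Races VALUES ('Gold',   'Race A', '1970-01-01'),
                             ('Gold',   'Race B', '1971-02-02'),
                             ('Silver', 'Race A', '1972-03-03'),
                             ('Gold',   'Race C', '1972-03-03');
""")

# No GROUP BY / COUNT: pair every gold row with a *different* gold row
# from the same nationality and the same year.
NO_AGGREGATE = """
    SELECT DISTINCT d1.Nationality
    FROM Races r1
    JOIN Driver d1    ON d1.Date_of_birth = r1.DriverDOB
    JOIN RaceEvent e1 ON e1.RaceName = r1.Event
    JOIN Races r2     ON r2.Medal = 'Gold'
    JOIN Driver d2    ON d2.Date_of_birth = r2.DriverDOB
    JOIN RaceEvent e2 ON e2.RaceName = r2.Event
    WHERE r1.Medal = 'Gold'
      AND e1.Year = :year AND e2.Year = :year
      AND d1.Nationality = d2.Nationality
      AND (r1.Event != r2.Event OR r1.DriverDOB != r2.DriverDOB)
    ORDER BY d1.Nationality
"""


def nationalities_with_two_golds(year):
    """Return nationalities holding at least two gold medals in the given year."""
    return [row[0] for row in conn.execute(NO_AGGREGATE, {"year": year})]
```

With this sample data, NatA has two golds in 1999 and is returned for that year, while NatB's single gold in 2000 is not enough.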

Related

PostgreSQL Insert into table with subquery selecting from multiple other tables

I am learning SQL (postgres) and am trying to insert a record into a table that references records from two other tables, as foreign keys.
Below is the syntax I am using for creating the tables and records:
-- Create a person table + insert single row
CREATE TABLE person (
pname VARCHAR(255) NOT NULL,
PRIMARY KEY (pname)
);
INSERT INTO person VALUES ('personOne');
-- Create a city table + insert single row
CREATE TABLE city (
cname VARCHAR(255) NOT NULL,
PRIMARY KEY (cname)
);
INSERT INTO city VALUES ('cityOne');
-- Create a employee table w/ForeignKey reference
CREATE TABLE employee (
ename VARCHAR(255) REFERENCES person(pname) NOT NULL,
ecity VARCHAR(255) REFERENCES city(cname) NOT NULL,
PRIMARY KEY(ename, ecity)
);
-- create employee entry referencing existing records
INSERT INTO employee VALUES(
SELECT pname FROM person
WHERE pname='personOne' AND <-- ISSUE
SELECT cname FROM city
WHERE cname='cityOne
);
Notice in the last block of code, where I'm doing an INSERT into the employee table, I don't know how to string together multiple SELECT sub-queries to get both the existing records from the person and city table such that I can create a new employee entry with attributes as such:
ename='personOne'
ecity='cityOne'
The textbook I have for class doesn't dive into sub-queries like this and I can't find any examples similar enough to mine such that I can understand how to adapt them for this use case.
Insight will be much appreciated.
There doesn’t appear to be any obvious relationship between city and person, which will make your life hard.
The general pattern for turning a SELECT that draws on two base tables into an INSERT is:
INSERT INTO table(column,list,here)
SELECT column,list,here
FROM
a
JOIN b ON a.x = b.y
In your case there isn’t really anything to join on, because your one-column tables have no column in common. Add, for example, a cityname column to person (it seems more likely that one city has many persons), and then you can do:
INSERT INTO employee(ename, ecity)
SELECT p.pname, c.cname
FROM
person p
JOIN city c ON p.cityname = c.cname
But even then, the two tables are already related to each other and don’t need the third table, so it’s perhaps an academic exercise only, not something you’d do in the real world.
If you just want to pair every person with every city, you can do:
INSERT INTO employee(ename, ecity)
SELECT pname, cname
FROM
person p
CROSS JOIN city c
But be warned: two people and two cities will cause 4 rows to be inserted, and so on (20 people and 40 cities give 800 rows, which is fairly useless imho).
However, I trust that the general pattern shown first will suffice for your learning; write a SELECT that shows the data you want to insert, then simply write INSERT INTO table(columns) above it. The number of columns inserted to must match the number of columns selected. Don’t forget that you can select fixed values if no column from the query has the info (INSERT INTO X(p,c,age) SELECT personname, cityname, 23 FROM ...)
The following will work for you:
INSERT INTO employee
SELECT pname, cname FROM person, city
WHERE pname='personOne' AND cname='cityOne';
This is a cross join producing a cartesian product of the two tables (since there is nothing to link the two). It reads slightly oddly, given that you could just as easily have inserted the values directly. But I assume this is because it is a learning exercise.
Please note that there is a typo in your create employee. You are missing a comma before the primary key.
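If the goal is specifically to combine two independent lookups, another option is to put each SELECT in its own parentheses as a scalar subquery inside VALUES. The sketch below uses Python's sqlite3 as a stand-in for PostgreSQL; the table and column names come from the question, and the same INSERT statement should be accepted by PostgreSQL as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (pname VARCHAR(255) NOT NULL PRIMARY KEY);
    CREATE TABLE city   (cname VARCHAR(255) NOT NULL PRIMARY KEY);
    CREATE TABLE employee (
        ename VARCHAR(255) NOT NULL REFERENCES person(pname),
        ecity VARCHAR(255) NOT NULL REFERENCES city(cname),
        PRIMARY KEY (ename, ecity)
    );
    INSERT INTO person VALUES ('personOne');
    INSERT INTO city   VALUES ('cityOne');
""")

# Each parenthesised SELECT is a scalar subquery yielding one value.
conn.execute("""
    INSERT INTO employee (ename, ecity)
    VALUES (
        (SELECT pname FROM person WHERE pname = ?),
        (SELECT cname FROM city   WHERE cname = ?)
    )
""", ("personOne", "cityOne"))

rows = conn.execute("SELECT ename, ecity FROM employee").fetchall()
```

If either lookup finds no row, its subquery evaluates to NULL and the NOT NULL constraint rejects the insert, which is usually the behaviour you want here.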

creating the sql query for company supervisors

I have four tables
create table emp (emp_ss int, emp_name nvarchar(20));
create table comp(comp_name nvarchar(20), comp_address nvarchar(20));
create table works (emp_ss int, comp_name nvarchar(20));
create table supervises (spv_ss int, emp_ss int );
Here SPV_SS and EMP_SS are both subsets of SS. Now I have to find:
the names of all the companies that have more than 4 supervisors
I have written a query for this problem but am not sure whether it is correct:
SELECT COMP_NAME , COUNT(EMP_SS) FROM WORKS
WHERE EMP_SS IN (SELECT DISTINCT SPV_SS FROM supervises)
GROUP BY COMP_NAME
HAVING COUNT(EMP_SS) > 4;
the names of the supervisors who have the largest number of employees
but I am unable to get the required result for this one:
SELECT SPV_SS, COUNT(*) max_ FROM supervises GROUP BY SPV_SS
You don't need a separate table for supervisors unless they come with extra information that doesn't belong in the employee table. Just add an extra field (a foreign key) to the employee table that links to the primary key of the same table.
First question: select the company, group by the company, and check in a HAVING clause whether the count of supervisors is larger than 4.
Second question: select the supervisor and count(empid), group by the supervisor, and add an ORDER BY clause on the count column.
I explained the logic; as for the actual SQL code, you're gonna have to figure that out yourself.
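For reference, the two queries could look roughly like the sketch below, which keeps the four-table schema from the question rather than the restructured one. It runs on Python's sqlite3 with made-up sample rows (the company names and the emp1..emp12 names are hypothetical). COUNT(DISTINCT ...) guards against an employee being listed twice for a company, and the second query compares each supervisor's count with the maximum so that ties are all returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_ss INT, emp_name NVARCHAR(20));
    CREATE TABLE works (emp_ss INT, comp_name NVARCHAR(20));
    CREATE TABLE supervises (spv_ss INT, emp_ss INT);
""")
# Hypothetical data: employees 1-10 work at Acme, 11-12 at Bolt.
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [(i, f"emp{i}") for i in range(1, 13)])
conn.executemany("INSERT INTO works VALUES (?, ?)",
                 [(i, "Acme") for i in range(1, 11)] + [(11, "Bolt"), (12, "Bolt")])
conn.executemany("INSERT INTO supervises VALUES (?, ?)",
                 [(1, 6), (1, 7), (1, 8), (2, 9), (3, 10), (4, 2), (5, 3), (11, 12)])

# Companies employing more than 4 distinct supervisors.
big_companies = [r[0] for r in conn.execute("""
    SELECT w.comp_name
    FROM works w
    WHERE w.emp_ss IN (SELECT spv_ss FROM supervises)
    GROUP BY w.comp_name
    HAVING COUNT(DISTINCT w.emp_ss) > 4
""")]

# Supervisors whose employee count equals the largest count (ties included).
top_supervisors = conn.execute("""
    SELECT e.emp_name, COUNT(*) AS employees
    FROM supervises s
    JOIN emp e ON e.emp_ss = s.spv_ss
    GROUP BY s.spv_ss, e.emp_name
    HAVING COUNT(*) = (SELECT MAX(n)
                       FROM (SELECT COUNT(*) AS n
                             FROM supervises
                             GROUP BY spv_ss) AS t)
""").fetchall()
```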

DB: How to set up a many to many table(s) to handle multiple selectable conditions

I am working on a search filter for a website that will help users find a venue (for get-togethers and ceremonies) that meets their needs. Filters would include such things as: style, amenities, event type, etc. Multiple options in a category can apply to a venue, so a user can select multiple options from style, amenities and event type categories when searching.
My issue is in how I should approach the table design in the database. Currently I have a Venue table with a unique id and basic information, and a number of tables representing each category (style, amenities, etc) where they contain an id and name field.
I know that I need an intermediary table to hold foreign keys, so each option applicable to a category is associated to the venue.
Option 1: Create for each category table a many to many intermediary table with foreign keys to that category and the venue.
Option 2: Create one large intermediary table with foreign keys for every category, as well as the Venue
i.e.
fk_venue
fk_style
fk_amenities
...
I am trying to decide which is more efficient and less of a problem to code for. Option 1 would require a query to each table, which may become complicated to work with, whereas option 2 seems easier to query but might need a much larger number of records to handle a venue with many amenities AND event types, for example.
This doesn't seem like a new problem but I have had trouble finding resources that detail how best to approach this. We are currently using MSSQL for the DB and are building the site using .net core.
Go with option one. Create a join table to record the many-to-many relationships of each available feature of a venue. Option 2 is very wasteful in terms of storage. Consider a case where you have a venue with only one amenity, when 50 amenity types are available. Also, as I understand what you are suggesting for option 2, you would have to update your database design each time you add an amenity, event_type, or style. That would be very difficult to support.
In the case of Option 1, some of the tables would be:
Table Name: venue_amenities
Columns: venue_id, amenity_id
Table Name: venue_event_types
Columns: venue_id, event_type_id
Table Name: venue_styles
Columns: venue_id, style_id
When you query everything with a filter, you could query it like:
select distinct
v.venue_id
from venues v
inner join venue_amenities va on v.venue_id = va.venue_id
inner join venue_event_types vet on v.venue_id = vet.venue_id
inner join venue_styles vs on v.venue_id = vs.venue_id
where va.amenity_id in ([selected amenities])
and vet.event_type_id in ([selected event types])
and vs.style_id in ([selected styles])
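One thing to watch with the joined form is that a venue disappears from the results when the user selects nothing in a category, or when the venue has no rows in one of the join tables. A possible variation, sketched here in Python with sqlite3 standing in for MSSQL, builds one EXISTS clause per category that actually has a selection. The sample venues, the CATEGORIES mapping, and the find_venues helper are hypothetical names for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE venues (venue_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE venue_amenities (venue_id INTEGER, amenity_id INTEGER,
                                  PRIMARY KEY (venue_id, amenity_id));
    CREATE TABLE venue_styles (venue_id INTEGER, style_id INTEGER,
                               PRIMARY KEY (venue_id, style_id));
    INSERT INTO venues VALUES (1, 'Barn'), (2, 'Loft'), (3, 'Garden');
    INSERT INTO venue_amenities VALUES (1, 10), (1, 11), (2, 10), (3, 12);
    INSERT INTO venue_styles VALUES (1, 20), (2, 21), (3, 20);
""")

# (join table, id column) per filter category. These names are hard-coded
# and never taken from user input, so formatting them into SQL is safe.
CATEGORIES = {
    "amenities": ("venue_amenities", "amenity_id"),
    "styles": ("venue_styles", "style_id"),
}


def find_venues(**selected):
    """Return ids of venues matching at least one selected option in every
    category that has a selection; empty categories are not filtered on."""
    sql = "SELECT v.venue_id FROM venues v WHERE 1 = 1"
    params = []
    for category, ids in selected.items():
        if not ids:
            continue
        table, column = CATEGORIES[category]
        marks = ", ".join("?" for _ in ids)
        sql += (f" AND EXISTS (SELECT 1 FROM {table} x"
                f" WHERE x.venue_id = v.venue_id AND x.{column} IN ({marks}))")
        params.extend(ids)
    sql += " ORDER BY v.venue_id"
    return [row[0] for row in conn.execute(sql, params)]
```

Because EXISTS never multiplies rows, no DISTINCT is needed, and the selected ids are passed as bound parameters rather than concatenated into the statement.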
Option 3: You could start out with a metadata design. This would allow you to have multiple records per item or entity.
Often these things evolve as tasks develop, as the process changes, and as you learn the data or the customer comes to understand some of the finer details that are drawn out over time.
I've seen similar designs used for hashtags or whitelists; searching for those might get you closer to what you are looking for. Here is a working example to get you started.
declare @venue as table(
VenueID int identity(1,1) not null primary key clustered
, Name_ nvarchar(255) not null
, Address_ nvarchar(255) null
);
declare @venueType as table (
VenueTypeID int identity(1,1) not null primary key clustered
, VenueType nvarchar(255) not null
);
declare @venueStuff as table (
VenueStuffID int identity(1,1) not null primary key clustered
, VenueID int not null -- constraint back to venueid
, VenueTypeID int not null -- constraint to dim or lookup table for ... attribute types
, AttributeValue nvarchar(255) not null
);
insert into @venue (Name_)
select 'Bob''s Funhouse';
insert into @venueStuff (VenueID, VenueTypeID, AttributeValue)
select 1, 1, 'Scarrrrry' union all
select 1, 2, 'Food Available' union all
select 1, 3, 'Game tables provided' union all
select 1, 4, 'Creepy';
-- identity values 1..4 line up with the VenueTypeID values used above
insert into @venueType (VenueType)
select 'Haunted House Theme' union all
select 'Concessions' union all
select 'Gaming' union all
select 'post apocalyptic';
select a.Name_
, b.AttributeValue
, c.VenueType
from @venue a
join @venueStuff b
on a.VenueID = b.VenueID
join @venueType c
on c.VenueTypeID = b.VenueTypeID;

Oracle SQL: Multiple lines of output per student

I am working on a SQL statement for a vendor whose product's import feature calls for a separate line of data for each special program a student is involved in. For example:
Student ID   Program Name
12345        Special Education
12345        Title 1
12345        Limited English
67891        Special Education
67891        Gifted and Talented
I'm not sure how to write the query statement to give me a separate line of data per student for each program they are involved with, instead of a single line with multiple columns. Can anyone get me pointed in the right direction?
My table structure is as follows
Table: Students
Relevant Columns:
student_number FLOAT(126)
last_name VARCHAR2(50 CHAR)
first_name VARCHAR2(50 CHAR)
iep NUMBER(10)
ellstatus NUMBER(10)
gifted NUMBER(10)
title1 NUMBER(10)
(plus hundreds of other non-relevant fields)
Thank you.
Without seeing the table structures, I am guessing you have the student info and the program info in separate tables. So you would do something like this:
SELECT s.StudentId, p.ProgramName
FROM students s
INNER JOIN programs p
ON s.studentid = p.studentid
I guess that you have the program information in each of the different columns (iep, ellstatus, gifted, title1). If this is the case, then it is not a normalized database, and it will probably give you some trouble later on.
As I don't know exactly how you map the number values of iep, title1, gifted and ellstatus to the programs, I'll give you a way to select the numbers, giving one row for each student/field relationship. You can add the formatting to the query to display the program names as you expect.
This uses the UNION operator, which combines the result sets of two or more queries. Plain UNION removes duplicate rows, so a repeated row would be displayed only once; I added ALL because I guess the numbers might be repeated.
select student_number, program from (
select student_number, iep program from students where iep is not null
union all
select student_number, title1 program from students where title1 is not null
union all
select student_number, gifted program from students where gifted is not null
union all
select student_number, ellstatus program from students where ellstatus is not null
);
To read more about the union operator, you can go here: http://docs.oracle.com/cd/B28359_01/server.111/b28286/queries004.htm
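To get the program names rather than the raw numbers, each branch of the UNION ALL can select a literal label instead of the column value. The sketch below runs on Python's sqlite3 rather than Oracle, and it rests on two assumptions that are not stated in the question: that a value of 1 in a column means the student is in that program, and that iep, title1, ellstatus and gifted correspond to the four program names in the sample output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_number INTEGER, iep INTEGER,
                           ellstatus INTEGER, gifted INTEGER, title1 INTEGER);
    -- Hypothetical flags: 1 = enrolled, 0 or NULL = not enrolled (assumption).
    INSERT INTO students VALUES (12345, 1, 1, 0, 1),
                                (67891, 1, 0, 1, NULL);
""")

# One branch per program column; the literal supplies the program name.
PROGRAM_ROWS = """
    SELECT student_number, 'Special Education' AS program_name
    FROM students WHERE iep = 1
    UNION ALL
    SELECT student_number, 'Title 1' FROM students WHERE title1 = 1
    UNION ALL
    SELECT student_number, 'Limited English' FROM students WHERE ellstatus = 1
    UNION ALL
    SELECT student_number, 'Gifted and Talented' FROM students WHERE gifted = 1
    ORDER BY student_number
"""

program_rows = conn.execute(PROGRAM_ROWS).fetchall()
```

If the real columns use other codes, only the WHERE conditions need to change; the shape of the query stays the same.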

What's the best way to store (and access) historical 1:M relationships in a relational database?

Hypothetical example:
I have Cars and Owners. Each Car belongs to one (and only one) Owner at a given time, but ownership may be transferred. Owners may, at any time, own zero or more cars. What I want is to store the historical relationships in a MySQL database such that, given an arbitrary time, I can look up the current assignment of Cars to Owners.
I.e. At time X (where X can be now or anytime in the past):
Who owns car Y?
Which cars (if any) does owner Z own?
Creating an M:N table in SQL (with a timestamp) is simple enough, but I'd like to avoid a correlated sub-query as this table will get large (and, hence, performance will suffer). Any ideas? I have a feeling that there's a way to do this by JOINing such a table with itself, but I'm not terribly experienced with databases.
UPDATE: I would like to avoid using both a "start_date" and "end_date" field per row as this would necessitate a (potentially) expensive look-up each time a new row is inserted. (Also, it's redundant).
Make a third table called CarOwner with fields for carid, ownerid, start_date and end_date.
When a car is bought, fill in the first three and check the table for an existing row for that car with no end_date, meaning someone else is still listed as the owner. If there is one, update that record with the purchase date as its end_date.
To find current owner:
select carid, ownerid from CarOwner where end_date is null
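To find owner at a point in time (here @AsOfDate is a parameter holding the moment you are asking about):
select carid, ownerid from CarOwner where start_date <= @AsOfDate
and (end_date > @AsOfDate or end_date is null)
For the current moment you can pass getdate(). getdate() is MS SQL Server specific, but every database has some function that returns the current date - just substitute.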
Of course if you also want additional info from the other tables, you would join to them as well.
select co.carid, co.ownerid, o.owner_name, c.make, c.Model, c.year
from CarOwner co
JOIN Car c on co.carid = c.carid
JOIN Owner o on o.ownerid = co.ownerid
where co.end_date is null
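The bookkeeping for this design can be sketched as follows, using Python's sqlite3 in place of MySQL or SQL Server. The transfer and owner_at helpers are hypothetical names for this illustration, and dates are stored as ISO-8601 text so that string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CarOwner (
        carid      INTEGER NOT NULL,
        ownerid    INTEGER NOT NULL,
        start_date TEXT NOT NULL,
        end_date   TEXT              -- NULL while the ownership is current
    )
""")


def transfer(carid, new_ownerid, when):
    """Close the open ownership row for the car, if any, and open a new one."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE CarOwner SET end_date = ? WHERE carid = ? AND end_date IS NULL",
            (when, carid))
        conn.execute(
            "INSERT INTO CarOwner (carid, ownerid, start_date) VALUES (?, ?, ?)",
            (carid, new_ownerid, when))


def owner_at(carid, when):
    """Return the id of the car's owner at the given time, or None."""
    row = conn.execute(
        """SELECT ownerid FROM CarOwner
           WHERE carid = ? AND start_date <= ?
             AND (end_date > ? OR end_date IS NULL)""",
        (carid, when, when)).fetchone()
    return row[0] if row else None


# Hypothetical history: owner 100 buys car 1, then sells it to owner 200.
transfer(1, 100, "2009-01-01")
transfer(1, 200, "2009-06-01")
```

The extra UPDATE on each transfer touches at most one row per car, so with an index on (carid, end_date) the write cost stays small.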
I've found that the best way to handle this sort of requirement is to just maintain a log of VehicleEvents, one of which would be ChangeOwner. In practice, you can derive the answers to all the questions posed here - at least as accurately as you are collecting the events.
Each record would have a timestamp indicating when the event occurred.
One benefit of doing it this way is that the minimum amount of data can be added in each event, but the information about the Vehicle can accumulate and evolve.
Also, with the timestamp, events can be added after the fact (as long as the timestamp accurately reflects when the event occurred).
Trying to maintain historical state for something like this in any other way I've tried leads to madness. (Maybe I'm still recovering. :D)
BTW, the distinguishing characteristic here is probably that it's a Time Series or Event Log, not that it's 1:m.
Given your business rule that each car belongs to at least one owner (i.e. owners exist before they are assigned to a car) and your operational constraint that the table may grow large, I'd design the schema as follows:
(generic sql 92 syntax:)
CREATE TABLE Owners
(
OwnerID integer not null default autoincrement,
OwnerName varchar(100) not null,
Primary key (OwnerID)
);
CREATE TABLE Cars
(
CarID integer not null default autoincrement,
OwnerID integer not null,
CarDescription varchar(100) not null,
CreatedOn timestamp not null default current_timestamp,
Primary key (CarID),
FOREIGN KEY (OwnerID) REFERENCES Owners(OwnerID)
);
CREATE TABLE HistoricalCarOwners
(
CarID integer not null,
OwnerID integer not null,
OwnedFrom timestamp null,
OwnedUntil timestamp null,
primary key (CarID, OwnerID),
FOREIGN KEY (OwnerID) REFERENCES Owners(OwnerID),
FOREIGN KEY (CarID) REFERENCES Cars(CarID)
);
I personally would not touch the third table from my client application but would simply let the database do the work - and maintain data integrity - with ON UPDATE AND ON DELETE triggers on the Cars table to populate the HistoricalCarOwners table whenever a car changes owners (i.e whenever an UPDATE is committed on the OwnerId column) or a car is deleted.
With the above schema, selecting the current car owner is trivial and selecting historical car owners is as simple as
select ownerid, ownername from owners o inner join historicalcarowners hco
on hco.ownerid = o.ownerid
where hco.carid = :arg_id and
:arg_timestamp between ownedfrom and owneduntil
order by ...
HTH, Vince
If you really do not want to have a start and end date you can use just a single date and do a query like the following.
SELECT * FROM CarOwner co
WHERE co.CarId = @CarId
AND co.TransferDate <= @AsOfDate
AND NOT EXISTS (SELECT * FROM CarOwner co2
WHERE co2.CarId = @CarId
AND co2.TransferDate <= @AsOfDate
AND co2.TransferDate > co.TransferDate)
or a slight variation
SELECT * FROM Car ca
JOIN CarOwner co ON ca.Id = co.CarId
AND co.TransferDate = (SELECT MAX(TransferDate)
FROM CarOwner WHERE CarId = @CarId
AND TransferDate <= @AsOfDate)
WHERE co.CarId = @CarId
These solutions are functionally equivalent to Javier's suggestion, but depending on the database you are using one solution may be faster than the other.
However, depending on your read versus write ratio you may find the performance better if you redundantly update the end date in the associative entity.
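For the single-date design, the lookup can also be written without a correlated subquery by joining to a derived table that holds the latest transfer per car up to the requested moment. This is only a sketch, run on Python's sqlite3 with made-up rows. The owners_as_of and cars_owned_by helpers are hypothetical, and the approach assumes a car is never transferred twice at exactly the same TransferDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CarOwner (CarId INTEGER NOT NULL, OwnerId INTEGER NOT NULL,
                           TransferDate TEXT NOT NULL);
    -- Hypothetical transfers, stored as ISO-8601 dates.
    INSERT INTO CarOwner VALUES (1, 100, '2009-01-01'),
                                (1, 200, '2009-06-01'),
                                (2, 100, '2009-02-01'),
                                (3, 200, '2009-04-01');
""")

# The derived table is computed once (not per outer row): the most recent
# transfer of each car on or before the requested date.
AS_OF = """
    SELECT co.CarId, co.OwnerId
    FROM CarOwner co
    JOIN (SELECT CarId, MAX(TransferDate) AS LastTransfer
          FROM CarOwner
          WHERE TransferDate <= ?
          GROUP BY CarId) latest
      ON latest.CarId = co.CarId
     AND latest.LastTransfer = co.TransferDate
    ORDER BY co.CarId
"""


def owners_as_of(as_of):
    """Return {car id: owner id} for every car that had an owner at as_of."""
    return dict(conn.execute(AS_OF, (as_of,)).fetchall())


def cars_owned_by(ownerid, as_of):
    """Return the sorted ids of the cars the owner held at as_of."""
    return sorted(car for car, owner in owners_as_of(as_of).items()
                  if owner == ownerid)
```

An index on (CarId, TransferDate) is what keeps the inner GROUP BY cheap as the table grows.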
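Why not have a transaction table? It would contain the car ID, the FROM owner, the TO owner and the date the transaction occurred.
Then all you do is find the last transaction for a car on or before the desired date.
To find cars owned by Owner 253 on March 1st, take the transfers to that owner up to that date and discard any car that changed hands again before it:
SELECT * FROM transactions t WHERE t.ownerToId = 253 AND t.date <= '2009-03-01'
AND NOT EXISTS (SELECT 1 FROM transactions t2 WHERE t2.carId = t.carId AND t2.date > t.date AND t2.date <= '2009-03-01')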
The cars table can have a column called ownerID. You can then simply:
1. select owners.* from cars inner join owners on cars.ownerid = owners.ownerid where cars.carid = y
2. select car from cars where ownerid = z
Not the exact syntax but simple pseudo code.