Constraints w/ Recursive Postgres Query - sql

I want to skip a certain city as I traverse my data. The query below currently finds all available flights from SLC to LA, including trips with layovers.
However, I want to be able to exclude certain cities from a flight plan. For example, if Montreal is a stop between SLC and LA, that trip should not be considered.
I've tried putting various conditions in the WHERE clauses, but to no avail. Any suggestions? Sample data and queries are given below.
WITH RECURSIVE segs AS (
SELECT f0.flight_num::text as flight
, src_city, dest_city
, dep_time AS departure
, arr_time AS arrival
, airfare, mileage
, 1 as hops
, (arr_time - dep_time)::interval AS total_time
, '00:00'::interval as waiting_time
FROM flight f0
WHERE src_city = 'SLC' -- <SRC_CITY>
UNION ALL
SELECT s.flight || '-->' || f1.flight_num::text as flight
, s.src_city, f1.dest_city
, s.departure AS departure
, f1.arr_time AS arrival
, s.airfare + f1.airfare as airfare
, s.mileage + f1.mileage as mileage
, s.hops + 1 AS hops
, s.total_time + (f1.arr_time - f1.dep_time)::interval AS total_time
, s.waiting_time + (f1.dep_time - s.arrival)::interval AS waiting_time
FROM segs s
JOIN flight f1
ON f1.src_city = s.dest_city
AND f1.dep_time > s.arrival -- you can't leave until you are there
)
SELECT *
FROM segs
WHERE dest_city = 'LA' -- <DEST_CITY>
ORDER BY airfare desc
;
create table flight
( flight_num BIGSERIAL PRIMARY KEY
, src_city varchar
, dest_city varchar
, dep_time TIME
, arr_time TIME
, airfare INTEGER
, mileage INTEGER
);
insert into flight VALUES
(101, 'Montreal', 'NY', '05:30', '06:45', 180, 170),
(102, 'Montreal', 'Washington', '01:00', '02:35', 100, 180),
(103, 'NY', 'Chicago', '08:00', '10:00', 150, 300),
(105, 'Washington', 'KansasCity', '06:00', '08:45', 200, 600),
(106, 'Washington', 'NY', '12:00', '13:30', 50, 80),
(107, 'Chicago', 'SLC', '11:00', '14:30', 220, 750),
(110, 'KansasCity', 'Denver', '14:00', '15:25', 180, 300),
(111, 'KansasCity', 'SLC', '13:00', '15:30', 200, 500),
(112, 'SLC', 'SanFran', '18:00', '19:30', 85, 210),
(113, 'SLC', 'LA', '17:30', '19:00', 185, 230),
(115, 'Denver', 'SLC', '15:00', '16:00', 75, 300),
(116, 'SanFran', 'LA', '22:00', '22:30', 50, 75),
(118, 'LA', 'Seattle', '20:00', '21:00', 150, 450);

To exclude certain cities from the flight plan, add a condition in two places in your query.
First, right after the src_city condition in the non-recursive part:
...
WHERE src_city = 'SLC' -- <SRC_CITY>
AND dest_city <> 'Montreal'
...
Second, in the recursive join condition:
...
AND f1.dep_time > s.arrival -- you can't leave until you are there
AND f1.dest_city <> 'Montreal'
...
I don't have Postgres, but I tried it with SQL Server and it seems to work.
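For anyone who wants to check the idea without a Postgres instance, here is a minimal sketch using Python's built-in sqlite3 module. It is a simplified stand-in rather than the original query: times are stored as 'HH:MM' text (which compares correctly as strings), the interval and mileage columns are dropped, and the `routes` function and its `skip` parameter are names invented for this illustration.

```python
import sqlite3

# Sample flights from the question: (flight_num, src, dest, dep, arr, airfare).
FLIGHTS = [
    (101, "Montreal", "NY", "05:30", "06:45", 180),
    (102, "Montreal", "Washington", "01:00", "02:35", 100),
    (103, "NY", "Chicago", "08:00", "10:00", 150),
    (105, "Washington", "KansasCity", "06:00", "08:45", 200),
    (106, "Washington", "NY", "12:00", "13:30", 50),
    (107, "Chicago", "SLC", "11:00", "14:30", 220),
    (110, "KansasCity", "Denver", "14:00", "15:25", 180),
    (111, "KansasCity", "SLC", "13:00", "15:30", 200),
    (112, "SLC", "SanFran", "18:00", "19:30", 85),
    (113, "SLC", "LA", "17:30", "19:00", 185),
    (115, "Denver", "SLC", "15:00", "16:00", 75),
    (116, "SanFran", "LA", "22:00", "22:30", 50),
    (118, "LA", "Seattle", "20:00", "21:00", 150),
]

QUERY = """
WITH RECURSIVE segs(flight, src_city, dest_city, arrival, airfare) AS (
    SELECT CAST(flight_num AS TEXT), src_city, dest_city, arr_time, airfare
    FROM flight
    WHERE src_city = :src
      AND dest_city <> :skip            -- filter 1: first leg must not land in the skipped city
    UNION ALL
    SELECT s.flight || '-->' || f.flight_num, s.src_city, f.dest_city,
           f.arr_time, s.airfare + f.airfare
    FROM segs s
    JOIN flight f
      ON f.src_city = s.dest_city
     AND f.dep_time > s.arrival         -- you can't leave until you are there
     AND f.dest_city <> :skip           -- filter 2: no later leg may land there either
)
SELECT flight, airfare FROM segs WHERE dest_city = :dst ORDER BY airfare DESC
"""

def routes(src, dst, skip):
    """Return (flight path, total airfare) for every trip that avoids `skip`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flight (flight_num INTEGER PRIMARY KEY, src_city TEXT, "
        "dest_city TEXT, dep_time TEXT, arr_time TEXT, airfare INTEGER)"
    )
    conn.executemany("INSERT INTO flight VALUES (?, ?, ?, ?, ?, ?)", FLIGHTS)
    rows = conn.execute(QUERY, {"src": src, "dst": dst, "skip": skip}).fetchall()
    conn.close()
    return rows

print(routes("SLC", "LA", "Montreal"))  # [('113', 185), ('112-->116', 135)]
print(routes("SLC", "LA", "SanFran"))   # [('113', 185)]
```

With this sample data no SLC to LA trip passes through Montreal, so skipping it changes nothing. Skipping SanFran instead removes the 112-->116 layover trip, which shows both filters doing their job.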

Related

Finding top 10 products sold in a year

I have the tables defined below. I want to find the top 10 products sold in a year, based on sales counts, in an optimized way. I want to know whether aggregation is really needed or whether I can accomplish this without it. My query is below. Can anyone suggest a better approach?
CREATE TABLE Customer (
id int not null,
first_name VARCHAR(30),
last_name VARCHAR(30),
Address VARCHAR(60),
State VARCHAR(30),
Phone text,
PRIMARY KEY(id)
);
CREATE TABLE Product (
ProductId int not null,
name VARCHAR(30),
unitprice int,
BrandID int,
Brandname varchar(30),
color VARCHAR(30),
PRIMARY KEY(ProductId)
);
Create Table Sales (
SalesId int not null,
Date date,
Customerid int,
Productid int,
Purchaseamount int,
PRIMARY KEY(SalesId),
FOREIGN KEY (Productid) REFERENCES Product(ProductId),
FOREIGN KEY (Customerid) REFERENCES Customer(id)
);
Sample Data:
insert into
Customer(id, first_name, last_name, address, state, phone)
values
(1111, 'andy', 'johnson', '123 Maryland Heights', 'MO', 3211451234),
(1112, 'john', 'smith', '237 Jackson Heights', 'TX', 3671456534),
(1113, 'sandy', 'fleming', '878 Jersey Heights', 'NJ', 2121456534),
(1114, 'tony', 'anderson', '789 Harrison Heights', 'CA', 6101456534);
insert into
Product(ProductId, name, unitprice, BrandId, Brandname)
values
(1, 'watch',200, 100, 'apple'),
(2, 'ipad', 429, 100, 'apple'),
(3, 'iphone', 799, 100, 'apple'),
(4, 'gear', 300, 110, 'samsung'),
(5, 'phone',1000, 110, 'samsung'),
(6, 'tab', 250, 110, 'samsung'),
(7, 'laptop', 1300, 120, 'hp'),
(8, 'mouse', 10, 120, 'hp'),
(9, 'monitor', 400, 130, 'dell'),
(10, 'keyboard', 40, 130, 'dell'),
(11, 'dvddrive', 100, 130, 'dell'),
(12, 'dvddrive', 90, 150, 'lg');
insert into
Sales(SalesId, Date, CustomerID, ProductID, Purchaseamount)
values (30, '01-10-2019', 1111, 1, 200),
(31, '02-10-2019', 1111, 3, 799),
(32, '03-10-2019', 1111, 2, 429),
(33, '04-10-2019', 1111, 4, 300),
(34, '05-10-2019', 1111, 5, 1000),
(35, '06-10-2019', 1112, 7, 1300),
(36, '07-10-2019', 1112, 9, 400),
(37, '08-10-2019', 1113, 5, 2000),
(38, '09-10-2019', 1113, 4, 300),
(39, '10-10-2019', 1113, 3, 799),
(40, '11-10-2019', 1113, 2, 858),
(41, '01-10-2020', 1111, 1, 400),
(42, '02-10-2020', 1111, 2, 429),
(43, '03-10-2020', 1112, 7, 1300),
(44, '04-10-2020', 1113, 7, 2600),
(45, '05-10-2020', 1114, 7, 1300),
(46, '06-10-2020', 1114, 7, 1300),
(47, '07-10-2020', 1114, 9, 800);
Tried this:
SELECT PCY.Name, PCY.Year, PCY.SEQNUM
FROM (SELECT P.Name AS Name, Extract('Year' from S.Date) AS YEAR, COUNT(P.Productid) AS CNT,
RANK() OVER (PARTITION BY Extract('Year' from S.Date) ORDER BY COUNT(P.Productid) DESC) AS RANK
FROM Sales S inner JOIN
Product P
ON S.Productid = P.Productid
) PCY
WHERE PCY.RANK <= 10;
I am seeing this error:
ERROR: column "p.name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 2: FROM (SELECT P.Name AS Name, Extract('Year' from S.Date) AS ...
^
SQL state: 42803
Character: 52
I don't understand why you don't want to use an aggregate function when you have to aggregate over your data. This query works fine, without any issues on the GROUP BY:
WITH stats AS (
SELECT EXTRACT(YEAR FROM S.Date) AS y
, P.productid
, P.name
, COUNT(*) AS numbers_sold
, RANK() OVER (PARTITION BY EXTRACT(YEAR FROM S.Date) ORDER BY COUNT(*) DESC) AS r
FROM product P
JOIN sales S ON S.Productid = P.Productid
GROUP BY 1, 2
)
SELECT y
, name
, numbers_sold
FROM stats
WHERE r <= 10;
This works because productid is the primary key of product, so the product name is functionally dependent on it and does not have to be listed in the GROUP BY.
By the way, I tested this on PostgreSQL 12, but it should work on older and newer versions as well.

Iterated sub sampling against distinct values, union results

I made a SQL fiddle here
I have a table in which each row has a category, a document id and a ranking.
The categories are ranked within themselves. For each category, I would like to select a sub sample. All the sub samples should be stacked together in a table.
The catch is that I would like to sub sample by iteratively fetching a halved row index among that category, e.g. if a given category has 32 items, then I would like to fetch rows 32, 16, 8, 4, 2, 1.
In my SQL fiddle I was able to do this for one particular category but I can't figure out how to:
a) do it for all categories in [Major Focus Area]
b) union the resulting subsamples into one table
Any hints or help is much appreciated! I am working in TSQL (MS SQL Server)
Sample data (MS Sql):
CREATE TABLE Rank_MajorAreas
([Rank] int, [Major Focus Area] varchar(17), [ID] int)
;
INSERT INTO Rank_MajorAreas
([Rank], [Major Focus Area], [ID])
VALUES
(1, 'Welfare', 71366),
(2, 'Welfare', 70415),
(3, 'Truck Driving', 70423),
(4, 'Peasant''s Office', 74566),
(5, 'Peasant''s Office', 71560),
(6, 'Nail Therapy', 77497),
(7, 'Truck Driving', 76193),
(8, 'Truck Driving', 79226),
(9, 'Truck Driving', 70222),
(10, 'Welfare', 77336),
(11, 'Truck Driving', 70823),
(12, 'Welfare', 77096),
(13, 'Welfare', 71335),
(14, 'Nail Therapy', 73551),
(15, 'Welfare', 72146),
(16, 'Truck Driving', 74023),
(17, 'Welfare', 71546),
(18, 'Nail Therapy', 74755),
(19, 'Peasant''s Office', 77834),
(20, 'Welfare', 75667),
(21, 'Peasant''s Office', 71342),
(22, 'Peasant''s Office', 77457),
(23, 'Peasant''s Office', 77923),
(24, 'Welfare', 76508),
(25, 'Welfare', 75714),
(26, 'Welfare', 73654),
(27, 'Welfare', 75753),
(28, 'Truck Driving', 71481),
(29, 'Truck Driving', 79424),
(30, 'Peasant''s Office', 76143),
(31, 'Truck Driving', 74076),
(32, 'Nail Therapy', 78714),
(33, 'Nail Therapy', 79924),
(34, 'Welfare', 71482),
(35, 'Welfare', 70050),
(36, 'Welfare', 76053),
(37, 'Nail Therapy', 79591),
(38, 'Peasant''s Office', 75197),
(39, 'Nail Therapy', 74104),
(40, 'Welfare', 72891),
(41, 'Truck Driving', 73621),
(42, 'Peasant''s Office', 71713),
(43, 'Welfare', 71979),
(44, 'Peasant''s Office', 71601),
(45, 'Peasant''s Office', 73928),
(46, 'Nail Therapy', 71759),
(47, 'Nail Therapy', 70379),
(48, 'Welfare', 71215),
(49, 'Truck Driving', 70908),
(50, 'Welfare', 71989)
;
Code thus far:
CREATE VIEW MFA AS
SELECT ROW_NUMBER() OVER(ORDER BY fa.[Rank] ASC) AS Row
,*
FROM Rank_MajorAreas AS fa
-- ideally we could make a view per Focus Area
WHERE fa.[Major Focus Area] = 'Welfare'
ORDER BY Row ASC
OFFSET 0 ROWS;
DECLARE @start int
SELECT @start = (SELECT COUNT(*) FROM MFA)
;WITH Sample( Row ) AS
(
Select @start as Row
UNION ALL
SELECT ROUND(Row/2, 0)
FROM Sample
WHERE Row > 0
)
SELECT * FROM MFA AS mfa
INNER JOIN Sample AS s on s.Row = mfa.Row
ORDER BY mfa.Row ASC
Desired results, where each focus area is subsampled and the subsamples are returned together as a single result:
Row  Rank  Major Focus Area  ID
1    1     Welfare           71366
2    2     Welfare           70415
4    12    Welfare           77096
9    24    Welfare           76508
19   50    Welfare           71989
...
1    6     Nail Therapy      77497
2    14    Nail Therapy      73551
4    32    Nail Therapy      78714
9    47    Nail Therapy      70379
You need to use PARTITION BY on the [Major Focus Area] column in the OVER clause, and start the recursion from a per-area row count instead of a single total. Here is the modified T-SQL:
CREATE VIEW MFA AS
SELECT ROW_NUMBER() OVER(PARTITION BY fa.[Major Focus Area] ORDER BY fa.[Rank] ASC) AS Row
,*
FROM Rank_MajorAreas AS fa
-- ideally we could make a view per Focus Area
ORDER BY [Major Focus Area], Row ASC
OFFSET 0 ROWS;
DECLARE @start int
SELECT @start = (SELECT COUNT(*) FROM MFA)
;WITH Sample( Row, fa ) AS
(
Select COUNT(*) as Row, [Major Focus Area] as fa FROM MFA GROUP BY [Major Focus Area]
UNION ALL
SELECT ROUND(Row/2, 0), fa
FROM Sample
WHERE Row > 0
)
SELECT mfa.Row, mfa.Rank, mfa.[Major Focus Area] FROM MFA AS mfa
INNER JOIN Sample AS s on s.Row = mfa.Row and s.fa=mfa.[Major Focus Area]
ORDER BY [Major Focus Area], mfa.Row ASC
SQL fiddle
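To make the halving logic easier to check by hand, here is a small Python sketch of the same idea, not a replacement for the T-SQL. The function names `halving_rows` and `subsample` are invented for this illustration, and the sample list contains only the Nail Therapy rows from the question.

```python
def halving_rows(count):
    """Return the row numbers count, count // 2, ..., 1 in ascending order."""
    rows = []
    while count > 0:
        rows.append(count)
        count //= 2  # integer division, like Row/2 on an int column in T-SQL
    return sorted(rows)

def subsample(ranked):
    """ranked: list of (rank, area, doc_id). Returns (row, rank, area, doc_id)
    for the halved row positions inside every area, all in one list."""
    by_area = {}
    for rank, area, doc_id in sorted(ranked):  # order by rank within each area
        by_area.setdefault(area, []).append((rank, area, doc_id))
    result = []
    for items in by_area.values():
        for row in halving_rows(len(items)):
            result.append((row,) + items[row - 1])  # row numbers are 1-based
    return result

# The nine Nail Therapy rows from the sample data.
nail_therapy = [
    (6, "Nail Therapy", 77497), (14, "Nail Therapy", 73551),
    (18, "Nail Therapy", 74755), (32, "Nail Therapy", 78714),
    (33, "Nail Therapy", 79924), (37, "Nail Therapy", 79591),
    (39, "Nail Therapy", 74104), (46, "Nail Therapy", 71759),
    (47, "Nail Therapy", 70379),
]

print(halving_rows(19))  # [1, 2, 4, 9, 19]
for line in subsample(nail_therapy):
    print(line)
```

For 19 Welfare rows this gives row numbers 1, 2, 4, 9, 19, and for the 9 Nail Therapy rows it gives 1, 2, 4, 9, which matches the desired results in the question.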

How can I write correct query?

WITH Encashment AS (
SELECT T.MachineId, T.Amount, CAST(Occured AS DATETIME) AS Occured
FROM (VALUES
(1, 101, '2017-10-20 09:36:40.057')
,(1, 203, '2017-10-14 12:36:30.081')
,(1, 400, '2017-10-11 04:17:38.023')
) AS T(MachineId, Amount, Occured)
), MoneyAccepted AS (
SELECT T.MachineId, T.Amount, CAST(Occured AS DATETIME) AS Occured
FROM (VALUES
(1, 1, '2017-10-15 09:36:40.057')
,(1, 100, '2017-10-16 12:36:30.081')
,(1, 100, '2017-10-12 16:17:38.023')
,(1, 1, '2017-10-13 09:37:47.057')
,(1, 1, '2017-10-13 09:37:47.057')
,(1, 1, '2017-10-12 15:37:47.057')
,(1, 100, '2017-09-15 12:37:31.081')
,(1, 100, '2017-09-15 16:37:31.081')
,(1, 100, '2017-09-16 13:37:31.081')
,(1, 100, '2017-09-17 13:37:31.081')
) AS T(MachineId, Amount, Occured)
)
I can get the amount for each encashment (SELECT Amount FROM Encashment).
But I want to calculate the amount from MoneyAccepted for every encashment, that is, the money accepted since the previous encashment.
For example, an encashment happened on 2017-10-20; up to that date and time, 101 had been accepted: 100 (2017-10-16 12:36:30.081) + 1 (2017-10-15 09:36:40.057).
How can I get that?
Thanks in advance!
I think what you are looking for is:
DECLARE @Encashment AS TABLE (MachineID INT, Amount INT, Occured DATETIME2)
DECLARE @MoneyAccepted AS TABLE (MachineID INT, Amount INT, Occured DATETIME2)
INSERT @Encashment (MachineID, Amount, Occured)
VALUES (1, 101, '20171020 09:36:40.057')
, (1, 203, '20171014 12:36:30.081')
, (1, 400, '20171011 04:17:38.023')
INSERT @MoneyAccepted (MachineID, Amount, Occured)
VALUES (1, 1, '20171015 09:36:40.057')
, (1, 100, '20171016 12:36:30.081')
, (1, 100, '20171012 16:17:38.023')
, (1, 100, '20171014 09:17:38.023')
, (1, 1, '20171013 09:37:47.057')
, (1, 1, '20171013 09:37:47.057')
, (1, 1, '20171012 15:37:31.081')
SELECT E.Occured AS Encashment_Occured
, SUM(MA.Amount) AS SUM_Amount
FROM @MoneyAccepted AS MA
INNER JOIN (
SELECT MachineID
, Amount
, Occured
, LAG(Occured) OVER(PARTITION BY MachineID ORDER BY Occured) AS Previous_Occured
FROM @Encashment
) AS E
ON E.MachineID = MA.MachineID
AND E.Occured > MA.Occured
AND E.Previous_Occured <= MA.Occured
GROUP BY E.Occured
Result:
+-----------------------------+------------+
| Encashment_Occured | SUM_Amount |
+-----------------------------+------------+
| 2017-10-14 12:36:30.0810000 | 203 |
| 2017-10-20 09:36:40.0570000 | 101 |
+-----------------------------+------------+
This uses LAG, which was introduced in SQL Server 2012, to get the start and end of each applicable date range in a single row.
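If it helps to see the windowing without SQL, here is a rough Python sketch of the same pairing logic, assuming a single machine (so there is no PARTITION BY equivalent). The function name `sums_between_encashments` is made up for this illustration, and the data is the sample from this answer.

```python
from datetime import datetime

def sums_between_encashments(encashments, accepted):
    """For each encashment after the first, sum the money accepted in the
    half-open range [previous encashment, this encashment)."""
    times = sorted(encashments)
    result = {}
    # zip(times, times[1:]) pairs each encashment with the one before it,
    # which is the role LAG(Occured) plays in the query.
    for prev, cur in zip(times, times[1:]):
        result[cur] = sum(amt for when, amt in accepted if prev <= when < cur)
    return result

ts = datetime.fromisoformat  # parses 'YYYY-MM-DD HH:MM:SS.fff'

encashments = [
    ts("2017-10-20 09:36:40.057"),
    ts("2017-10-14 12:36:30.081"),
    ts("2017-10-11 04:17:38.023"),
]
accepted = [
    (ts("2017-10-15 09:36:40.057"), 1),
    (ts("2017-10-16 12:36:30.081"), 100),
    (ts("2017-10-12 16:17:38.023"), 100),
    (ts("2017-10-14 09:17:38.023"), 100),
    (ts("2017-10-13 09:37:47.057"), 1),
    (ts("2017-10-13 09:37:47.057"), 1),
    (ts("2017-10-12 15:37:31.081"), 1),
]

for when, total in sums_between_encashments(encashments, accepted).items():
    print(when, total)
```

As in the query, the earliest encashment has no previous one and so gets no row. One difference: the INNER JOIN drops an encashment with no accepted money in its range, while this sketch would report it with a sum of 0.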
Please edit your question, remove html and use plain text for sample data.
I think you could use CROSS APPLY.
Try this:
WITH Encashment AS (
SELECT T.MachineId, T.Amount, CAST(Occured AS DATETIME) AS Occured
FROM (VALUES
(1, 101, '2017-10-20 09:36:40.057')
,(1, 203, '2017-10-14 12:36:30.081')
,(1, 400, '2017-10-11 04:17:38.023')
) AS T(MachineId, Amount, Occured)
), MoneyAccepted AS (
SELECT T.MachineId, T.Amount, CAST(Occured AS DATETIME) AS Occured
FROM (VALUES
(1, 1, '2017-10-15 09:36:40.057')
,(1, 100, '2017-10-16 12:36:30.081')
,(1, 100, '2017-10-12 16:17:38.023')
,(1, 1, '2017-10-13 09:37:47.057')
,(1, 1, '2017-10-13 09:37:47.057')
,(1, 1, '2017-10-12 15:37:47.057')
,(1, 100, '2017-09-15 12:37:31.081')
,(1, 100, '2017-09-15 16:37:31.081')
,(1, 100, '2017-09-16 13:37:31.081')
,(1, 100, '2017-09-17 13:37:31.081')
) AS T(MachineId, Amount, Occured)
)
SELECT M.*, EN.*
FROM MoneyAccepted AS M
CROSS APPLY (
SELECT TOP (1) E.* FROM Encashment AS E
WHERE E.MachineId = M.MachineId AND E.Occured > M.Occured
ORDER BY E.Occured ASC
) AS EN

Performing a subquery using values from a column in Oracle

I'm trying to create a calculated column in SQL. Basically I need to get a set of distinct dates and determine how many customers there are in the population on that particular date. The result should be something like:
Date       | Customers
2016-01-01 | 1
2016-01-01 | 2
2016-01-05 | 3
2016-02-09 | 4
etc.
I created a sample database & data (using MySQL as I don't have permission to create tables in our Oracle dbs) with the following script:
create database customer_example;
use customer_example;
create table customers (
customer_id int not null primary key,
customer_name varchar(255) not null,
term_date DATE);
create table employee (
employee_id int not null primary key,
employee_name varchar(255) not null);
create table cust_emp (
ce_id int not null AUTO_INCREMENT,
emp_id int not null,
cust_id int not null,
start_date date,
end_date date,
deleted_yn boolean,
primary key (emp_id, cust_id, ce_id),
foreign key (cust_id) references customers(customer_id),
foreign key (emp_id) references employee(employee_id));
insert into customers (customer_id, customer_name)
values (1, 'Bobby Tables'), (2, 'Grover Cleveland'), (3, 'Chester Arthur'), (4, 'Jan Bush'), (5, 'Emanuel Porter'), (6, 'Darren King'), (7, 'Casey Mcguire'), (8, 'Robin Simpson'), (9, 'Robin Tables'), (10, 'Mitchell Arnold');
insert into customers (customer_id, customer_name, term_date)
values (11, 'Terrell Graves', '2017-01-01'), (12, 'Richard Wagner', '2016-10-31'), (13, 'Glenn Saunders', '2016-11-19'), (14, 'Bruce Irvin', '2016-03-11'), (15, 'Glenn Perry','2016-06-06'), (16, 'Hazel Freeman', '2016-07-10'),
(17, 'Martin Freeman', '2016-02-11'), (18, 'Morgan Freeman', '2017-02-01'), (19, 'Dirk Drake', '2017-01-12'), (20, 'Fraud Fraud', '2016-12-31');
insert into employee (employee_id, employee_name)
values (1000, 'Cedrick French'), (1001, 'Jane Phillips'), (1002, 'Brian Green'), (1003, 'Shawn Brooks'), (1004, 'Clarence Thomas');
insert into cust_emp (emp_id, cust_id, start_date, end_date)
values (1000, 1, '2016-01-01', '2016-02-01'), (1000, 1, '2016-02-01', '2016-02-01'), (1000, 2,'2016-01-05', '2016-01-16'),(1000, 3,'2016-02-09', '2016-03-14'),(1000, 4,'2016-03-20', '2016-04-23'),
(1000, 5,'2016-01-01', '2016-01-16'),(1000, 6,'2016-01-01', '2016-01-16'),(1004, 7, '2016-01-14', '2016-01-16'),
(1004, 8, '2016-01-13', '2016-01-16'),(1004, 9, '2016-01-05', '2016-01-16'), (1003, 12, '2016-04-21', '2016-11-30');
insert into cust_emp (emp_id, cust_id, start_date, deleted_yn)
values (1002, 11, '2016-04-10', TRUE),(1003, 10, '2016-01-16', FALSE), (1004, 12, '2016-04-20', TRUE), (1004, 12, '2016-04-19', FALSE), (1003, 13, '2016-06-06', TRUE), (1002, 14, '2016-06-10', TRUE),
(1004, 15, '2016-03-25', TRUE), (1004, 17, '2016-01-02', TRUE), (1004, 18, '2017-01-01', TRUE), (1004, 19, '2016-11-13', TRUE), (1004, 20, '2016-03-10', TRUE), (1004, 16, '2016-05-13', TRUE);
insert into cust_emp (emp_id, cust_id, start_date)
values (1002, 1, '2016-02-01'), (1004, 2, '2016-01-16'),(1003, 3, '2016-03-14'),(1002, 4, '2016-04-23'),(1004, 5, '2016-01-16'),(1002, 6, '2016-01-16'),(1004, 7, '2016-01-16'),
(1004, 8, '2016-01-16'),(1002, 9, '2016-01-16'), (1004, 10, '2016-01-16');
The following SQL works fine in MySQL but when I try it in Oracle, I get an 'invalid identifier' on 'dates':
select distinct(ce.start_date) as dates,
(select count(distinct(c.customer_id))
from customers c
inner join cust_emp ce on c.customer_id = ce.cust_id
where ce.start_date < dates
and (ce.end_date > dates or (ce.deleted_yn = false or ce.deleted_yn is null))
and (c.term_date > dates or c.term_date is null)
)
from cust_emp as ce;
It seems that Oracle cannot see the dates alias from inside the subquery. I've tried a CTE as well, but it gave the same error. How can I rewrite this so that I can determine how many customers there were on each date in Oracle?
Huh?
Isn't this what you want?
select ce.dates as dates, count(distinct c.customer_id)
from cust_emp ce join
customers c
on c.customer_id = ce.cust_id
where ce.start_date < ce.dates and
(ce.end_date > ce.dates or ce.deleted_yn = false or ce.deleted_yn is null) and
(c.term_date > ce.dates or c.term_date is null)
group by ce.dates
order by ce.dates;
I don't really understand the use of the subquery with select distinct. The logic you describe is more easily understood as a simple aggregation.
I'm not sure where dates comes from. It is not in your data model, but it is in your sample query.

Recursive/Hierarchical Query Using Postgres

The table: Flight (flight_num, src_city, dest_city, dep_time, arr_time, airfare, mileage)
I need to find the cheapest fare for unlimited stops from any given source city to any given destination city. The catch is that this can involve multiple flights, so for example if I'm flying from Montreal->KansasCity I can go from Montreal->Washington and then from Washington->KansasCity and so on. How would I go about generating this using a Postgres query?
Sample Data:
create table flight(
flight_num BIGSERIAL PRIMARY KEY,
source_city varchar,
dest_city varchar,
dep_time int,
arr_time int,
airfare int,
mileage int
);
insert into flight VALUES
(101, 'Montreal', 'NY', 0530, 0645, 180, 170),
(102, 'Montreal', 'Washington', 0100, 0235, 100, 180),
(103, 'NY', 'Chicago', 0800, 1000, 150, 300),
(105, 'Washington', 'KansasCity', 0600, 0845, 200, 600),
(106, 'Washington', 'NY', 1200, 1330, 50, 80),
(107, 'Chicago', 'SLC', 1100, 1430, 220, 750),
(110, 'KansasCity', 'Denver', 1400, 1525, 180, 300),
(111, 'KansasCity', 'SLC', 1300, 1530, 200, 500),
(112, 'SLC', 'SanFran', 1800, 1930, 85, 210),
(113, 'SLC', 'LA', 1730, 1900, 185, 230),
(115, 'Denver', 'SLC', 1500, 1600, 75, 300),
(116, 'SanFran', 'LA', 2200, 2230, 50, 75),
(118, 'LA', 'Seattle', 2000, 2100, 150, 450);
[this answer is based on Gordon's]
I changed arr_time and dep_time to TIME datatypes, which makes calculations easier.
Also added result columns for total_time and waiting_time. Note: if there are any loops possible in the graph, you will need to avoid them (possibly using an array to store the path)
WITH RECURSIVE segs AS (
SELECT f0.flight_num::text as flight
, src_city, dest_city
, dep_time AS departure
, arr_time AS arrival
, airfare, mileage
, 1 as hops
, (arr_time - dep_time)::interval AS total_time
, '00:00'::interval as waiting_time
FROM flight f0
WHERE src_city = 'SLC' -- <SRC_CITY>
UNION ALL
SELECT s.flight || '-->' || f1.flight_num::text as flight
, s.src_city, f1.dest_city
, s.departure AS departure
, f1.arr_time AS arrival
, s.airfare + f1.airfare as airfare
, s.mileage + f1.mileage as mileage
, s.hops + 1 AS hops
, s.total_time + (f1.arr_time - f1.dep_time)::interval AS total_time
, s.waiting_time + (f1.dep_time - s.arrival)::interval AS waiting_time
FROM segs s
JOIN flight f1
ON f1.src_city = s.dest_city
AND f1.dep_time > s.arrival -- you can't leave until you are there
)
SELECT *
FROM segs
WHERE dest_city = 'LA' -- <DEST_CITY>
ORDER BY airfare desc
;
FYI: the changes to the table structure:
create table flight
( flight_num BIGSERIAL PRIMARY KEY
, src_city varchar
, dest_city varchar
, dep_time TIME
, arr_time TIME
, airfare INTEGER
, mileage INTEGER
);
And to the data:
insert into flight VALUES
(101, 'Montreal', 'NY', '05:30', '06:45', 180, 170),
(102, 'Montreal', 'Washington', '01:00', '02:35', 100, 180),
(103, 'NY', 'Chicago', '08:00', '10:00', 150, 300),
(105, 'Washington', 'KansasCity', '06:00', '08:45', 200, 600),
(106, 'Washington', 'NY', '12:00', '13:30', 50, 80),
(107, 'Chicago', 'SLC', '11:00', '14:30', 220, 750),
(110, 'KansasCity', 'Denver', '14:00', '15:25', 180, 300),
(111, 'KansasCity', 'SLC', '13:00', '15:30', 200, 500),
(112, 'SLC', 'SanFran', '18:00', '19:30', 85, 210),
(113, 'SLC', 'LA', '17:30', '19:00', 185, 230),
(115, 'Denver', 'SLC', '15:00', '16:00', 75, 300),
(116, 'SanFran', 'LA', '22:00', '22:30', 50, 75),
(118, 'LA', 'Seattle', '20:00', '21:00', 150, 450);
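On the note about loops: one way to guard against them is to carry the list of visited cities along in each row and refuse to revisit one. The sketch below shows the idea with Python's built-in sqlite3 module, using a comma-delimited path string in place of a Postgres array. The three-flight table and city names A, B, C are made up for the illustration, and the time columns are left out on purpose so that the loop would otherwise recurse forever.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flight (flight_num INTEGER PRIMARY KEY, "
    "src_city TEXT, dest_city TEXT, airfare INTEGER)"
)
conn.executemany("INSERT INTO flight VALUES (?, ?, ?, ?)", [
    (1, "A", "B", 10),
    (2, "B", "A", 10),  # closes a loop A -> B -> A
    (3, "B", "C", 20),
])

QUERY = """
WITH RECURSIVE segs(path, dest_city, airfare) AS (
    -- path holds every city visited so far, wrapped in commas: ',A,B,'
    SELECT ',' || src_city || ',' || dest_city || ',', dest_city, airfare
    FROM flight
    WHERE src_city = :src
    UNION ALL
    SELECT s.path || f.dest_city || ',', f.dest_city, s.airfare + f.airfare
    FROM segs s
    JOIN flight f ON f.src_city = s.dest_city
    -- cycle guard: skip any flight that lands in a city already on the path
    WHERE instr(s.path, ',' || f.dest_city || ',') = 0
)
SELECT path, airfare FROM segs WHERE dest_city = :dst ORDER BY airfare
"""

rows = conn.execute(QUERY, {"src": "A", "dst": "C"}).fetchall()
print(rows)  # [(',A,B,C,', 30)]
```

In Postgres the same guard could be written with a text[] column and a condition such as f1.dest_city <> ALL (s.path), but I have only run the SQLite version shown here.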
You want to use a recursive CTE for this. However, you will have to make a decision about how many flights to include. The following (untested) query shows how to do this, limiting the number of flight segments to 5:
with recursive segs as (
select cast(f.flight_num as varchar(255)) as flight, src_city, dest_city, dep_time,
arr_time, airfare, mileage, 1 as numsegs
from flight f
where src_city = <SRC_CITY>
union all
select cast(s.flight||'-->'||cast(f.flight_num as varchar(255)) as varchar(255)) as flight, s.src_city, f.dest_city,
s.dep_time, f.arr_time, s.airfare + f.airfare as airfare,
s.mileage + f.mileage as mileage, s.numsegs + 1
from segs s join
flight f
on f.src_city = s.dest_city -- the next flight leaves from where the previous one landed
where s.numsegs < 5
)
select *
from segs
where dest_city = <DEST_CITY>
order by airfare asc -- cheapest first
limit 1;
Something like this:
select * from
(select flight_num, airfare from flight where src_city = ? and dest_city = ?
union
select f1.flight_num || f2.flight_num, f1.airfare+f2.airfare
from flight f1, flight f2 where f1.src_city = ? and f2.dest_city = ? and f1.dest_city = f2.src_city
union
...
) s order by airfare desc
I didn't test that as I'm leaving that for you so there might be subtle problems that require testing. This is clearly homework since no airline plans things this way. So I don't mind leaving you extra work.