Recursive/Hierarchical Query Using Postgres - sql

The table: Flight (flight_num, src_city, dest_city, dep_time, arr_time, airfare, mileage)
I need to find the cheapest fare for unlimited stops from any given source city to any given destination city. The catch is that this can involve multiple flights, so for example if I'm flying from Montreal->KansasCity I can go from Montreal->Washington and then from Washington->KansasCity and so on. How would I go about generating this using a Postgres query?
Sample Data:
create table flight(
flight_num BIGSERIAL PRIMARY KEY,
source_city varchar,
dest_city varchar,
dep_time int,
arr_time int,
airfare int,
mileage int
);
insert into flight VALUES
(101, 'Montreal', 'NY', 0530, 0645, 180, 170),
(102, 'Montreal', 'Washington', 0100, 0235, 100, 180),
(103, 'NY', 'Chicago', 0800, 1000, 150, 300),
(105, 'Washington', 'KansasCity', 0600, 0845, 200, 600),
(106, 'Washington', 'NY', 1200, 1330, 50, 80),
(107, 'Chicago', 'SLC', 1100, 1430, 220, 750),
(110, 'KansasCity', 'Denver', 1400, 1525, 180, 300),
(111, 'KansasCity', 'SLC', 1300, 1530, 200, 500),
(112, 'SLC', 'SanFran', 1800, 1930, 85, 210),
(113, 'SLC', 'LA', 1730, 1900, 185, 230),
(115, 'Denver', 'SLC', 1500, 1600, 75, 300),
(116, 'SanFran', 'LA', 2200, 2230, 50, 75),
(118, 'LA', 'Seattle', 2000, 2100, 150, 450);

[this answer is based on Gordon's]
I changed arr_time and dep_time to TIME datatypes, which makes calculations easier.
Also added result columns for total_time and waiting_time. Note: if there are any loops possible in the graph, you will need to avoid them (possibly using an array to store the path)
WITH RECURSIVE segs AS (
SELECT f0.flight_num::text as flight
, src_city, dest_city
, dep_time AS departure
, arr_time AS arrival
, airfare, mileage
, 1 as hops
, (arr_time - dep_time)::interval AS total_time
, '00:00'::interval as waiting_time
FROM flight f0
WHERE src_city = 'SLC' -- <SRC_CITY>
UNION ALL
SELECT s.flight || '-->' || f1.flight_num::text as flight
, s.src_city, f1.dest_city
, s.departure AS departure
, f1.arr_time AS arrival
, s.airfare + f1.airfare as airfare
, s.mileage + f1.mileage as mileage
, s.hops + 1 AS hops
, s.total_time + (f1.arr_time - f1.dep_time)::interval AS total_time
, s.waiting_time + (f1.dep_time - s.arrival)::interval AS waiting_time
FROM segs s
JOIN flight f1
ON f1.src_city = s.dest_city
AND f1.dep_time > s.arrival -- you can't leave until you are there
)
SELECT *
FROM segs
WHERE dest_city = 'LA' -- <DEST_CITY>
ORDER BY airfare desc
;
FYI: the changes to the table structure:
create table flight
( flight_num BIGSERIAL PRIMARY KEY
, src_city varchar
, dest_city varchar
, dep_time TIME
, arr_time TIME
, airfare INTEGER
, mileage INTEGER
);
And to the data:
insert into flight VALUES
(101, 'Montreal', 'NY', '05:30', '06:45', 180, 170),
(102, 'Montreal', 'Washington', '01:00', '02:35', 100, 180),
(103, 'NY', 'Chicago', '08:00', '10:00', 150, 300),
(105, 'Washington', 'KansasCity', '06:00', '08:45', 200, 600),
(106, 'Washington', 'NY', '12:00', '13:30', 50, 80),
(107, 'Chicago', 'SLC', '11:00', '14:30', 220, 750),
(110, 'KansasCity', 'Denver', '14:00', '15:25', 180, 300),
(111, 'KansasCity', 'SLC', '13:00', '15:30', 200, 500),
(112, 'SLC', 'SanFran', '18:00', '19:30', 85, 210),
(113, 'SLC', 'LA', '17:30', '19:00', 185, 230),
(115, 'Denver', 'SLC', '15:00', '16:00', 75, 300),
(116, 'SanFran', 'LA', '22:00', '22:30', 50, 75),
(118, 'LA', 'Seattle', '20:00', '21:00', 150, 450);

You want to use a recursive CTE for this. However, you will have to make a decision about how many flights to include. The following (untested) query shows how to do this, limiting the number of flight segments to 5:
with recursive segs as (
select cast(f.flight_num as varchar(255)) as flight, src_city, dest_city, dept_time,
arr_time, airfare, mileage, 1 as numsegs
from flight f
where src_city = <SRC_CITY>
union all
select cast(s.flight||'-->'||cast(f.flight_num as varchar(255)) as varchar(255)) as flight, s.src_city, f.dest_city,
s.dept_time, f.arr_time, s.airfare + f.airfare as airfare,
s.mileage + f.mileage as milage, s.numsegs + 1
from segs s join
flight f
on s.src_city = f.dest_city
where s.numsegs < 5
)
select *
from segs
where dest_city = <DEST_CITY>
order by airfare desc
limit 1;

Something like this:
select * from
(select flight_num, airfare from flight where src_city = ? and dest_city = ?
union
select f1.flight_num || f2.flight_num, f1.airfare+f2.airfare
from flight f1, flight f2 where f1.src_city = ? and f2.dest_city = ? and f1.dest_city = f2.src_city
union
...
) s order by airfare desc
I didn't test that as I'm leaving that for you so there might be subtle problems that require testing. This is clearly homework since no airline plans things this way. So I don't mind leaving you extra work.

Related

Finding top 10 products sold in a year

I have these tables below along with the definition. I want to find top 10 products sold in a year after finding counts and without using aggregation and in an optimized way. I want to know if aggregation is still needed or I can accomplish it without using aggregation. Below is the query. Can anyone suggest a better approach.
CREATE TABLE Customer (
id int not null,
first_name VARCHAR(30),
last_name VARCHAR(30),
Address VARCHAR(60),
State VARCHAR(30),
Phone text,
PRIMARY KEY(id)
);
CREATE TABLE Product (
ProductId int not null,
name VARCHAR(30),
unitprice int,
BrandID int,
Brandname varchar(30),
color VARCHAR(30),
PRIMARY KEY(ProductId)
);
Create Table Sales (
SalesId int not null,
Date date,
Customerid int,
Productid int,
Purchaseamount int,
PRIMARY KEY(SalesId),
FOREIGN KEY (Productid) REFERENCES Product(ProductId),
FOREIGN KEY (Customerid) REFERENCES Customer(id)
)
Sample Data:
insert into
Customer(id, first_name, last_name, address, state, phone)
values
(1111, 'andy', 'johnson', '123 Maryland Heights', 'MO', 3211451234),
(1112, 'john', 'smith', '237 Jackson Heights', 'TX', 3671456534),
(1113, 'sandy', 'fleming', '878 Jersey Heights', 'NJ', 2121456534),
(1114, 'tony', 'anderson', '789 Harrison Heights', 'CA', 6101456534)
insert into
Product(ProductId, name, unitprice, BrandId, Brandname)
values
(1, 'watch',200, 100, 'apple'),
(2, 'ipad', 429, 100, 'apple'),
(3, 'iphone', 799, 100, 'apple'),
(4, 'gear', 300, 110, 'samsung'),
(5, 'phone',1000, 110, 'samsung'),
(6, 'tab', 250, 110, 'samsung'),
(7, 'laptop', 1300, 120, 'hp'),
(8, 'mouse', 10, 120, 'hp'),
(9, 'monitor', 400, 130, 'dell'),
(10, 'keyboard', 40, 130, 'dell'),
(11, 'dvddrive', 100, 130, 'dell'),
(12, 'dvddrive', 90, 150, 'lg')
insert into
Sales(SalesId, Date, CustomerID, ProductID, Purchaseamount)
values (30, '01-10-2019', 1111, 1, 200),
(31, '02-10-2019', 1111, 3, 799),
(32, '03-10-2019', 1111, 2, 429),
(33, '04-10-2019', 1111, 4, 300),
(34, '05-10-2019', 1111, 5, 1000),
(35, '06-10-2019', 1112, 7, 1300),
(36, '07-10-2019', 1112, 9, 400),
(37, '08-10-2019', 1113, 5, 2000),
(38, '09-10-2019', 1113, 4, 300),
(39, '10-10-2019', 1113, 3, 799),
(40, '11-10-2019', 1113, 2, 858),
(41, '01-10-2020', 1111, 1, 400),
(42, '02-10-2020', 1111, 2, 429),
(43, '03-10-2020', 1112, 7, 1300),
(44, '04-10-2020', 1113, 7, 2600),
(45, '05-10-2020', 1114, 7, 1300),
(46, '06-10-2020', 1114, 7, 1300),
(47, '07-10-2020', 1114, 9, 800)
Tried this:
SELECT PCY.Name, PCY.Year, PCY.SEQNUM
FROM (SELECT P.Name AS Name, Extract('Year' from S.Date) AS YEAR, COUNT(P.Productid) AS CNT,
RANK() OVER (PARTITION BY Extract('Year' from S.Date) ORDER BY COUNT(P.Productid) DESC) AS RANK
FROM Sales S inner JOIN
Product P
ON S.Productid = P.Productid
) PCY
WHERE PCY.RANK <= 10;
I am seeing this error:
ERROR: column "p.name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 2: FROM (SELECT P.Name AS Name, Extract('Year' from S.Date) AS ...
^
SQL state: 42803
Character: 52
I don't understand why you don't want to use an aggregate function when you have to aggregate over your data. This query works fine, without any issues on the GROUP BY:
WITH stats AS (
SELECT EXTRACT
( YEAR FROM DATE ) AS y,
P.productid,
P.NAME,
COUNT ( * ) numbers_sold,
RANK ( ) OVER ( PARTITION BY EXTRACT ( YEAR FROM DATE ) ORDER BY COUNT ( * ) DESC ) r
FROM
product
P JOIN sales S ON S.Productid = P.Productid
GROUP BY
1,2
)
SELECT y
, name
, numbers_sold
FROM stats
WHERE r <= 10;
This works because the productid is the primary key that has a functional dependency to the product name.
By the way, tested on version 12, but it should work on older and newer versions as well.

How can I write correct query?

WITH Encashment AS (
SELECT T.MachineId, T.Amount, CAST(Occured AS DATETIME) AS Occured
FROM (VALUES
(1, 101, '2017-10-20 09:36:40.057')
,(1, 203, '2017-10-14 12:36:30.081')
,(1, 400, '2017-10-11 04:17:38.023')
) AS T(MachineId, Amount, Occured)
), MoneyAccepted AS (
SELECT T.MachineId, T.Amount, CAST(Occured AS DATETIME) AS Occured
FROM (VALUES
(1, 1, '2017-10-15 09:36:40.057')
,(1, 100, '2017-10-16 12:36:30.081')
,(1, 100, '2017-10-12 16:17:38.023')
,(1, 1, '2017-10-13 09:37:47.057')
,(1, 1, '2017-10-13 09:37:47.057')
,(1, 1, '2017-10-12 15:37:47.057')
,(1, 100, '2017-09-15 12:37:31.081')
,(1, 100, '2017-09-15 16:37:31.081')
,(1, 100, '2017-09-16 13:37:31.081')
,(1, 100, '2017-09-17 13:37:31.081')
) AS T(MachineId, Amount, Occured)
)
I can get Amount among two encashment.(Select Amount from Encashment).
But, I want to get amount from MoneyAccepted for every Encashment.
For example: Encashment happened in 20-10-2017,till this dateTime accepted 101(100(2017-10-16 12:36:30.081)+1(2017-10-15 09:36:40.057)) money.
How can I get that?
Thanks in advance!
I think what you are looking for is:
DECLARE #Encashment AS TABLE (MachineID INT, Amount INT, Occured DATETIME2)
DECLARE #MoneyAccepted AS TABLE (MachineID INT, Amount INT, Occured DATETIME2)
INSERT #Encashment (MachineID, Amount, Occured)
VALUES (1, 101, '20171020 09:36:40.057')
, (1, 203, '20171014 12:36:30.081')
, (1, 400, '20171011 04:17:38.023')
INSERT #MoneyAccepted (MachineID, Amount, Occured)
VALUES (1, 1, '20171015 09:36:40.057')
, (1, 100, '20171016 12:36:30.081')
, (1, 100, '20171012 16:17:38.023')
, (1, 100, '20171014 09:17:38.023')
, (1, 1, '20171013 09:37:47.057')
, (1, 1, '20171013 09:37:47.057')
, (1, 1, '20171012 15:37:31.081')
SELECT E.Occured AS Encashment_Occured
, SUM(MA.Amount) AS SUM_Amount
FROM #MoneyAccepted AS MA
INNER JOIN (
SELECT MachineID
, Amount
, Occured
, LAG(Occured) OVER(PARTITION BY MachineID ORDER BY Occured) AS Previous_Occured
FROM #Encashment
) AS E
ON E.MachineID = MA.MachineID
AND E.Occured > MA.Occured
AND E.Previous_Occured <= MA.Occured
GROUP BY E.Occured
Result:
+-----------------------------+------------+
| Encashment_Occured | SUM_Amount |
+-----------------------------+------------+
| 2017-10-14 12:36:30.0810000 | 203 |
| 2017-10-20 09:36:40.0570000 | 101 |
+-----------------------------+------------+
This uses LAG, which was introduced in sql server 2012, in order to get the range of applicable dates in a single row.
Please edit your question, remove html and use plain text for sample data.
I think you could use CROSS APPLY.
Try this:
WITH Encashment AS (
SELECT T.MachineId, T.Amount, CAST(Occured AS DATETIME) AS Occured
FROM (VALUES
(1, 101, '2017-10-20 09:36:40.057')
,(1, 203, '2017-10-14 12:36:30.081')
,(1, 400, '2017-10-11 04:17:38.023')
) AS T(MachineId, Amount, Occured)
), MoneyAccepted AS (
SELECT T.MachineId, T.Amount, CAST(Occured AS DATETIME) AS Occured
FROM (VALUES
(1, 1, '2017-10-15 09:36:40.057')
,(1, 100, '2017-10-16 12:36:30.081')
,(1, 100, '2017-10-12 16:17:38.023')
,(1, 1, '2017-10-13 09:37:47.057')
,(1, 1, '2017-10-13 09:37:47.057')
,(1, 1, '2017-10-12 15:37:47.057')
,(1, 100, '2017-09-15 12:37:31.081')
,(1, 100, '2017-09-15 16:37:31.081')
,(1, 100, '2017-09-16 13:37:31.081')
,(1, 100, '2017-09-17 13:37:31.081')
) AS T(MachineId, Amount, Occured)
)
SELECT M.*, EN.*
FROM MoneyAccepted AS M
CROSS APPLY (
SELECT TOP (1) E.* FROM Encashment AS E
WHERE E.MachineId = M.MachineId AND E.Occured > M.Occured
ORDER BY E.Occured ASC
) AS EN

"Missing closing parenthesis" SQL

keep getting an error on the line where I put my values in, it specifically appears on the "15" of "15 Water Road", same line as VALUES. The error states:
"Syntax Error: Missing closing parenthesis".
Also when I try run it on XAMPP I get:
"You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'Water Rd, 0412345678, 750, 3), (2, Smith, 2, 14 Water Rd, 0412345679, 400, 4),"
Extremely new to DB's and SQL, any help would be much appreciated.
CREATE TABLE Customer_info (
CustomerID int,
LastName varchar(50),
Username int,
Address varchar(777),
PhoneNumber int,
TotalSpent int,
OrdersCompleted int
);
INSERT INTO Customer_info (CustomerID, LastName, Username, Address, PhoneNumber, TotalSpent, OrdersCompleted)
VALUES (1, Mason, 1, 15 Water Rd, 0412345678, 750, 3), (2, Smith, 2, 14 Water Rd, 0412345679, 400, 4),
(3, Lens, 3, 1 Water rd, 0412345671, 700, 7), (4, Marks, 4, 5 Fire Rd, 0412345672, 100, 1),
(5, Barr, 5, 19 Fire Rd, 0412345673, 500, 1), (6, Blok, 6, 21 Fire Rd, 0412345674, 1000, 10),
(7, Pume, 7, 21 Water Rd, 0412345675, 1000, 2), (8, Po, 8, 77 Earth Rd, 0412345676, 1000, 4),
(9, Adid, 9, 20 Earth Rd, 0412345677, 2, 200), (10, Lew, 10, 6 Earth Rd, 0412345679, 250, 1),
(11, Chia, 11, 1 Earth Rd, 0412345681, 150, 1), (12, Barrett, 12, 11 Wind Rd, 0412345682, 450, 9),
(13, James, 13, 9 Wind Rd, 0412345683, 250, 10), (14, Foop, 14, 2 Window St, 0412345684, 200, 10),
(15, Watch, 15, 8 Window St, 0412345685, 1200, 1), (16, Irving, 16, 11 Window St, 0412345686, 1400, 2),
(17, Jones, 17, 22 Window St, 0412345687, 1600, 2);
You should enclose the non-numeric values with (single) quotation marks.
The error here is slightly cryptic, but what's going on is that when the DB engine tries to parse what you're inserting, it doesn't know how these characters should be used - for example, whether the commas are part of the string or not (remember, it doesn't know that an address field is unlikely to contain commas).
You are not enclosing varchar elements inside quotes, hence address values is read wrongly and error is thrown, enclose last name and address in quotes and insert this would solve the problem
INSERT INTO Customer_info (CustomerID, LastName, Username, Address, PhoneNumber, TotalSpent, OrdersCompleted) VALUES (1, 'Mason', 1, '15 Water Rd', 0412345678, 750, 3),(1, 'Mason', 1, '15 Water Rd', 0412345678, 750, 3);
<?php
CREATE TABLE Customer_info (
CustomerID int,
LastName varchar(50),
Username int,
Address varchar(777),
PhoneNumber int,
TotalSpent int,
OrdersCompleted int
);
"INSERT INTO Customer_info('CustomerID','LastName','Username','Address','PhoneNumber','TotalSpent','OrdersCompleted')VALUES
('1', 'Mason','1','15 Water Rd','0412345678','750','3'),
('2', 'Smith','2','14 Water Rd','0412345679','400','4'),
('3', 'Lens','3','1 Water rd','0412345671','700','7'),
('4', 'Marks','4','5 Fire Rd','0412345672','100','1'),
('5', 'Barr', '5', '19 Fire Rd','0412345673','500','1'),
('6','Blok','6','21 Fire Rd','0412345674','1000','10'),
('7','Pume','7','21 Water Rd','0412345675','1000','2'),
('8','Po','8','77 Earth Rd','0412345676','1000','4'),
('9','Adid','9','20 Earth Rd','0412345677','2','200'),
('10', 'Lew', '10','6 Earth Rd','0412345679', '250', '1'),
('11','Chia','11','1 Earth Rd','0412345681','150','1'),
('12','Barrett','12','11 Wind Rd','0412345682','450','9'),
('13','James','13','9 Wind Rd','0412345683','250','10'),
('14', 'Foop','14','2 Window St','0412345684','200', '10'),
('15','Watch','15','8 Window St','0412345685','1200','1'),
('16', 'Irving', '16', '11 Window St', '0412345686', '1400', '2'),
('17', 'Jones', '17', '22 Window St', '0412345687', '1600', '2')";
?>

Performing a subquery using values from a column in Oracle

I'm trying to create a calculated column in SQL. Basically I need to get a set of distinct dates and determine how many customers there are in the population on that particular date. The result should be something like:
Date______| Customers
2016-01-01 | 1
2016-01-01 | 2
2016-01-05 | 3
2016-02-09 | 4
etc.
I created a sample database & data (using MySQL as I don't have permission to create tables in our Oracle dbs) with the following script:
create database customer_example;
use customer_example;
create table customers (
customer_id int not null primary key,
customer_name varchar(255) not null,
term_date DATE);
create table employee (
employee_id int not null primary key,
employee_name varchar(255) not null);
create table cust_emp (
ce_id int not null AUTO_INCREMENT,
emp_id int not null,
cust_id int not null,
start_date date,
end_date date,
deleted_yn boolean,
primary key (emp_id, cust_id, ce_id),
foreign key (cust_id) references customers(customer_id),
foreign key (emp_id) references employee(employee_id));
insert into customers (customer_id, customer_name)
values (1, 'Bobby Tables'), (2, 'Grover Cleveland'), (3, 'Chester Arthur'), (4, 'Jan Bush'), (5, 'Emanuel Porter'), (6, 'Darren King'), (7, 'Casey Mcguire'), (8, 'Robin Simpson'), (9, 'Robin Tables'), (10, 'Mitchell Arnold');
insert into customers (customer_id, customer_name, term_date)
values (11, 'Terrell Graves', '2017-01-01'), (12, 'Richard Wagner', '2016-10-31'), (13, 'Glenn Saunders', '2016-11-19'), (14, 'Bruce Irvin', '2016-03-11'), (15, 'Glenn Perry','2016-06-06'), (16, 'Hazel Freeman', '2016-07-10'),
(17, 'Martin Freeman', '2016-02-11'), (18, 'Morgan Freeman', '2017-02-01'), (19, 'Dirk Drake', '2017-01-12'), (20, 'Fraud Fraud', '2016-12-31');
insert into employee (employee_id, employee_name)
values (1000, 'Cedrick French'), (1001, 'Jane Phillips'), (1002, 'Brian Green'), (1003, 'Shawn Brooks'), (1004, 'Clarence Thomas');
insert into cust_emp (emp_id, cust_id, start_date, end_date)
values (1000, 1, '2016-01-01', '2016-02-01'), (1000, 1, '2016-02-01', '2016-02-01'), (1000, 2,'2016-01-05', '2016-01-16'),(1000, 3,'2016-02-09', '2016-03-14'),(1000, 4,'2016-03-20', '2016-04-23'),
(1000, 5,'2016-01-01', '2016-01-16'),(1000, 6,'2016-01-01', '2016-01-16'),(1004, 7, '2016-01-14', '206-01-16'),
(1004, 8, '2016-01-13', '2016-01-16'),(1004, 9, '2016-01-05', '2016-01-16'), (1003, 12, '2016-04-21', '2016-11-30');
insert into cust_emp (emp_id, cust_id, start_date, deleted_yn)
values (1002, 11, '2016-04-10', TRUE),(1003, 10, '2016-01-16', FALSE), (1004, 12, '2016-04-20', TRUE), (1004, 12, '2016-04-19', FALSE), (1003, 13, '2016-06-06', TRUE), (1002, 14, '2016-06-10', TRUE),
(1004, 15, '2016-03-25', TRUE), (1004, 17, '2016-01-02', TRUE), (1004, 18, '2017-01-01', TRUE), (1004, 19, '2016-11-13', TRUE), (1004, 20, '2016-03-10', TRUE), (1004, 16, '2016-05-13', TRUE);
insert into cust_emp (emp_id, cust_id, start_date)
values (1002, 1, '2016-02-01'), (1004, 2, '2016-01-16'),(1003, 3, '2016-03-14'),(1002, 4, '2016-04-23'),(1004, 5, '2016-01-16'),(1002, 6, '2016-01-16'),(1004, 7, '2016-01-16'),
(1004, 8, '2016-01-16'),(1002, 9, '2016-01-16'), (1004, 10, '2016-01-16');
The following SQL works fine in MySQL but when I try it in Oracle, I get an 'invalid identifier' on 'dates':
select distinct(ce.start_date) as dates,
(select count(distinct(c.customer_id))
from customers c
inner join cust_emp ce on c.customer_id = ce.cust_id
where ce.start_date < dates
and (ce.end_date > dates or (ce.deleted_yn = false or ce.deleted_yn is null))
and (c.term_date > dates or c.term_date is null)
)
from cust_emp as ce;
It seems as though this is because the dates is too far in a subquery. I've tried a CTE as well, but that seems to have the same issue as it gave the same error. How can I re-write this so that I can assess how many customers were there for each date in Oracle?
Huh?
Isn't this what you want?
select ce.dates as dates, count(distinct c.customer_id)
from cust_emp ce join
customers c
on c.customer_id = ce.cust_id
where ce.start_date < ce.dates and
(ce.end_date > ce.dates or ce.deleted_yn = false or ce.deleted_yn is null) and
(c.term_date > ce.dates or c.term_date is null)
group by ce.dates
order by ce.dates;
I don't really understand the use of the subquery with select distinct. The logic you describe is more easily understood as a simple aggregation.
I'm not sure where dates comes from. It is not in your data model, but it is in your sample query.

Constraints w/ Recursive Postgres Query

I'm looking to skip a certain city as I traverse my data. Currently, this query works to find all available flights from SLC to LA, including trips with layovers. You'll see this in the picture below.
However, I want to be able to exclude certain cities in a flight plan. For example, if Montreal is a stop between SLC and LA, that trip wouldn't be considered.
I've tried putting various things in the WHERE clauses, but to no avail. Any other suggestions? Sample data an queries are given below.
WITH RECURSIVE segs AS (
SELECT f0.flight_num::text as flight
, src_city, dest_city
, dep_time AS departure
, arr_time AS arrival
, airfare, mileage
, 1 as hops
, (arr_time - dep_time)::interval AS total_time
, '00:00'::interval as waiting_time
FROM flight f0
WHERE src_city = 'SLC' -- <SRC_CITY>
UNION ALL
SELECT s.flight || '-->' || f1.flight_num::text as flight
, s.src_city, f1.dest_city
, s.departure AS departure
, f1.arr_time AS arrival
, s.airfare + f1.airfare as airfare
, s.mileage + f1.mileage as mileage
, s.hops + 1 AS hops
, s.total_time + (f1.arr_time - f1.dep_time)::interval AS total_time
, s.waiting_time + (f1.dep_time - s.arrival)::interval AS waiting_time
FROM segs s
JOIN flight f1
ON f1.src_city = s.dest_city
AND f1.dep_time > s.arrival -- you can't leave until you are there
)
SELECT *
FROM segs
WHERE dest_city = 'LA' -- <DEST_CITY>
ORDER BY airfare desc
;
create table flight
( flight_num BIGSERIAL PRIMARY KEY
, src_city varchar
, dest_city varchar
, dep_time TIME
, arr_time TIME
, airfare INTEGER
, mileage INTEGER
);
insert into flight VALUES
(101, 'Montreal', 'NY', '05:30', '06:45', 180, 170),
(102, 'Montreal', 'Washington', '01:00', '02:35', 100, 180),
(103, 'NY', 'Chicago', '08:00', '10:00', 150, 300),
(105, 'Washington', 'KansasCity', '06:00', '08:45', 200, 600),
(106, 'Washington', 'NY', '12:00', '13:30', 50, 80),
(107, 'Chicago', 'SLC', '11:00', '14:30', 220, 750),
(110, 'KansasCity', 'Denver', '14:00', '15:25', 180, 300),
(111, 'KansasCity', 'SLC', '13:00', '15:30', 200, 500),
(112, 'SLC', 'SanFran', '18:00', '19:30', 85, 210),
(113, 'SLC', 'LA', '17:30', '19:00', 185, 230),
(115, 'Denver', 'SLC', '15:00', '16:00', 75, 300),
(116, 'SanFran', 'LA', '22:00', '22:30', 50, 75),
(118, 'LA', 'Seattle', '20:00', '21:00', 150, 450);
To exclude certain cities from the flight plan you should add where clauses at 2 places in your query as following:
Right after src_city condition
...
WHERE src_city = 'SLC' -- <SRC_CITY>
AND dest_city <> 'Montreal'
...
In the recursive join condition
...
AND f1.dep_time > s.arrival -- you can't leave until you are there
AND f1.dest_city <> 'Montreal'
...
I don't have Postgress but I tried it with SQL server and it seems to work.