I currently have the following:
TABLE "QUARTO":
CREATE TABLE Quarto (
Id number(2) NOT NULL,
LotacaoMaxima number(1) NOT NULL,
TipoQuartoId number(1) NOT NULL,
NumeroQuartoNumSequencial number(3) NOT NULL,
NumeroQuartoAndarId varchar2(1) NOT NULL,
PRIMARY KEY (Id));
TABLE RESERVA:
CREATE TABLE Reserva (
Id number(3) NOT NULL,
ClienteNif number(9) NOT NULL,
QuartoId number(2) NOT NULL,
DataInicio date NOT NULL,
DataFim date NOT NULL,
NumPessoas number(1) NOT NULL,
Estado varchar2(15) NOT NULL,
DataCancelamento date,
PRIMARY KEY (Id));
And some data I have in both is:
QUARTO:
| ID | LOTACAOMAXIMA | TIPOQUARTOID | NUMEROQUARTONUMSEQUENCIAL | NUMEROQUARTOANDARID |
| :--- | :--- | :--- | :--- | :--- |
| 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 2 | 1 |
| 3 | 1 | 1 | 3 | 1 |
| 4 | 1 | 1 | 4 | 1 |
| 5 | 1 | 1 | 5 | 1 |
| 6 | 1 | 1 | 6 | 1 |
| 7 | 1 | 1 | 7 | 1 |
| 8 | 1 | 1 | 8 | 1 |
| 9 | 1 | 1 | 9 | 1 |
| 10 | 1 | 1 | 10 | 1 |
| 11 | 2 | 2 | 11 | 1 |
| 12 | 2 | 2 | 12 | 1 |
| 13 | 2 | 2 | 13 | 1 |
| 14 | 2 | 2 | 14 | 1 |
| 15 | 2 | 2 | 15 | 1 |
| 16 | 2 | 2 | 16 | 1 |
| 17 | 2 | 2 | 17 | 1 |
| 18 | 2 | 2 | 18 | 1 |
| 19 | 2 | 2 | 19 | 1 |
| 20 | 2 | 2 | 20 | 1 |
RESERVA:
| ID | CLIENTENIF | QUARTOID | DATAINICIO | DATAFIM | NUMPESSOAS | ESTADO | DATACANCELAMENTO |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | 296837970 | 11 | 2020-06-01 00:00:00 | 2020-06-12 00:00:00 | 1 | Finalizada | NULL |
| 2 | 275784703 | 17 | 2020-06-13 00:00:00 | 2020-06-21 00:00:00 | 1 | Finalizada | NULL |
| 3 | 220347654 | 11 | 2020-07-07 00:00:00 | 2020-07-15 00:00:00 | 2 | Finalizada | NULL |
| 4 | 294772545 | 12 | 2020-08-01 00:00:00 | 2020-08-15 00:00:00 | 2 | Finalizada | NULL |
| 5 | 220347654 | 3 | 2020-01-01 00:00:00 | 2020-01-16 00:00:00 | 1 | Finalizada | NULL |
This is my current query:
WITH CONTAGEM_QUARTO_POR_ID AS (SELECT q.ID, COUNT(r.QUARTOID) AS NUM_RESERVAS, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
FROM RESERVA r
INNER JOIN QUARTO q on q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID)
SELECT t.TIPOQUARTOID, t.NUMEROQUARTOANDARID, MAX(t.NUM_RESERVAS) AS MAX
FROM CONTAGEM_QUARTO_POR_ID t
GROUP BY t.TIPOQUARTOID, t.NUMEROQUARTOANDARID
And the Output is the following:
| TIPOQUARTOID | NUMEROQUARTOANDARID | MAX |
| :----------- | :------------------ | :-- |
| 1 | 2 | 2 |
| 2 | 1 | 8 |
| 1 | 1 | 1 |
Alongside the data I currently have, I also want to show the ID of each row, but when I add t.ID to the SELECT it forces me to add it to the GROUP BY, and the output is this:
| TIPOQUARTOID | NUMEROQUARTOANDARID | MAX | ID |
| :----------- | :------------------ | :-- | :- |
| 2 | 1 | 2 | 11 |
| 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 3 |
| 1 | 2 | 2 | 21 |
| 2 | 1 | 1 | 17 |
| 2 | 1 | 1 | 12 |
| 2 | 1 | 8 | 16 |
I only want to get the max value and the ID associated with that MAX.
You can use the KEEP clause in your query without changing it much as follows:
WITH CONTAGEM_QUARTO_POR_ID AS
(SELECT q.ID, COUNT(r.QUARTOID) AS NUM_RESERVAS, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
FROM RESERVA r
INNER JOIN QUARTO q on q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID)
SELECT t.TIPOQUARTOID, t.NUMEROQUARTOANDARID, MAX(t.NUM_RESERVAS) AS MAX,
max(t.ID) keep(dense_rank first order by t.NUM_RESERVAS desc nulls last) as ID -- this
FROM CONTAGEM_QUARTO_POR_ID t
GROUP BY t.TIPOQUARTOID, t.NUMEROQUARTOANDARID
You need the MAX() OVER () analytic function on the NUM_RESERVAS column, with PARTITION BY TIPOQUARTOID, NUMEROQUARTOANDARID to do the grouping on those columns, and then keep only the rows whose count equals that maximum, such as
WITH CONTAGEM_QUARTO_POR_ID AS
(
SELECT q.ID,
COUNT(r.QUARTOID) AS NUM_RESERVAS,
q.TIPOQUARTOID,
q.NUMEROQUARTOANDARID
FROM RESERVA r
JOIN QUARTO q
on q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
), t AS
(
SELECT t.TIPOQUARTOID, t.NUMEROQUARTOANDARID, t.NUM_RESERVAS,
MAX(t.NUM_RESERVAS)
OVER (PARTITION BY t.TIPOQUARTOID, t.NUMEROQUARTOANDARID) AS MAX,
t.ID
FROM CONTAGEM_QUARTO_POR_ID t
)
SELECT TIPOQUARTOID, NUMEROQUARTOANDARID, NUM_RESERVAS, ID
FROM t
WHERE NUM_RESERVAS = MAX
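or, more directly, by using a HAVING clause (note that this version compares each room's count with its TIPOQUARTOID, so it returns the maximum only when those two values happen to coincide)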
SELECT q.TIPOQUARTOID,
q.NUMEROQUARTOANDARID,
COUNT(r.QUARTOID) AS NUM_RESERVAS,
q.ID
FROM RESERVA r
JOIN QUARTO q
on q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
HAVING COUNT(r.QUARTOID) = q.TIPOQUARTOID
I wouldn't suggest two levels of aggregation. Just use window functions:
WITH CONTAGEM_QUARTO_POR_ID AS (
SELECT q.ID, COUNT(*) AS NUM_RESERVAS, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID,
ROW_NUMBER() OVER (PARTITION BY q.TIPOQUARTOID, q.NUMEROQUARTOANDARID ORDER BY COUNT(*) DESC) as seqnum
FROM RESERVA r INNER JOIN
QUARTO q
ON q.ID = r.QUARTOID
GROUP BY q.ID, q.TIPOQUARTOID, q.NUMEROQUARTOANDARID
)
SELECT cq.*
FROM CONTAGEM_QUARTO_POR_ID cq
WHERE seqnum = 1;
I think this would have slightly better performance than two aggregations (but it is worth checking).
One advantage of this approach is that it is more flexible. If you want ties, just change the ROW_NUMBER() to RANK() in the subquery.
Perhaps more importantly, ROW_NUMBER() is an "idiom" in SQL for returning one row (or a specific number of rows) per group. Learning how to use it is very valuable.
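To try the ROW_NUMBER() idea outside Oracle, here is a minimal sketch that runs it on SQLite through Python's `sqlite3` module (SQLite 3.25 or later is assumed, since it needs window functions). It loads only the rooms and reservations from the sample data in the question that matter for the count. The CTE names `contagem` and `ranked` and the extra `Id` tie-breaker in the ORDER BY are my own additions, not part of the answers above, and the ranking is done in a second CTE rather than in the same SELECT as the COUNT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Quarto (Id INTEGER PRIMARY KEY, TipoQuartoId INTEGER, NumeroQuartoAndarId TEXT);
CREATE TABLE Reserva (Id INTEGER PRIMARY KEY, QuartoId INTEGER);
-- Only the rooms that appear in the five sample reservations
INSERT INTO Quarto VALUES (3, 1, '1'), (11, 2, '1'), (12, 2, '1'), (17, 2, '1');
INSERT INTO Reserva VALUES (1, 11), (2, 17), (3, 11), (4, 12), (5, 3);
""")

best_rooms = conn.execute("""
WITH contagem AS (
    -- reservations per room
    SELECT q.Id, q.TipoQuartoId, q.NumeroQuartoAndarId, COUNT(*) AS num_reservas
    FROM Reserva r
    JOIN Quarto q ON q.Id = r.QuartoId
    GROUP BY q.Id, q.TipoQuartoId, q.NumeroQuartoAndarId
),
ranked AS (
    -- rank rooms inside each (type, floor) group, busiest first;
    -- Id is an assumed tie-breaker so the result is deterministic
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY TipoQuartoId, NumeroQuartoAndarId
                              ORDER BY num_reservas DESC, Id) AS seqnum
    FROM contagem c
)
SELECT TipoQuartoId, NumeroQuartoAndarId, num_reservas, Id
FROM ranked
WHERE seqnum = 1
ORDER BY TipoQuartoId, NumeroQuartoAndarId
""").fetchall()

print(best_rooms)
```

With only those five reservations, room 3 wins its group with one reservation and room 11 wins its group with two. Swapping ROW_NUMBER() for RANK() would return every tied room instead of one per group.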
I am working on a problem where I have to get the length of the longest streak per customer, but to get the exact result I also have to count the row where the streak breaks. My table looks like this
+-----------------+--------+-------+
| customer_number | Months | Flags |
+-----------------+--------+-------+
| 1 | 12 | 1 |
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 1 | 4 | 1 |
| 1 | 5 | 1 |
| 1 | 8 | 1 |
| 1 | 9 | 1 |
| 1 | 10 | 1 |
| 1 | 11 | 1 |
| 6 | 12 | 1 |
| 6 | 1 | 1 |
| 6 | 2 | 1 |
| 6 | 3 | 1 |
| 6 | 4 | 1 |
| 6 | 5 | 4 |
| 6 | 9 | 1 |
| 6 | 10 | 1 |
| 6 | 11 | 1 |
| 7 | 5 | 1 |
| 8 | 9 | 1 |
| 8 | 10 | 1 |
| 8 | 11 | 1 |
| 9 | 9 | 1 |
| 9 | 10 | 1 |
| 9 | 11 | 1 |
| 10 | 11 | 1 |
+-----------------+--------+-------+
and my desired output is
+----------+--------------------+
| Customer | Consecutive streak |
+----------+--------------------+
| 1 | 10 |
| 6 | 6 |
| 7 | 1 |
| 8 | 3 |
| 9 | 3 |
| 10 | 1 |
+----------+--------------------+
the code I have
SELECT customer_number, max(streak) max_consecutive_streak FROM (
SELECT customer_number, COUNT(*) as streak
FROM
(select *,
(row_number() over (order by customer_number) -
row_number() over (order by customer_number)
) as counts
from table1
) cc
group by customer_number, counts
)
GROUP BY 1;
It works, but for customer_number 6 it returns 5 and I want 6, meaning the row with flag 4 should also be counted in the longest streak, since that is where the streak breaks. Any idea how I can achieve that?
You can use a cte with row_number:
with cte(r, id, flag) as (
select row_number() over (order by c.customer_number), c.customer_number, c.Flags from customers c
),
freq(id, t, f) as (
select c2.id, c2.f, count(*) from
(select c.id, (select sum(c1.flag!=c.flag) from cte c1 where c1.id=c.id and c1.r <= c.r) f from cte c)
c2 group by c2.id, c2.f
)
select id, max(f) from freq group by id;
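As a different take on the same problem (not the query above), here is a hedged sketch that counts the breaking row as part of the streak it ends. It rests on two assumptions that the question does not state: a flag other than 1 is what breaks a streak, and the rows have a stable order, modelled here with a hypothetical `seq` column filled in the order the rows are listed. It runs on SQLite 3.25+ through Python's `sqlite3`.

```python
import sqlite3

# Rows in the order shown in the question: (customer_number, months, flags)
rows = [(1, m, 1) for m in (12, 1, 2, 3, 4, 5, 8, 9, 10, 11)]
rows += [(6, 12, 1), (6, 1, 1), (6, 2, 1), (6, 3, 1), (6, 4, 1),
         (6, 5, 4), (6, 9, 1), (6, 10, 1), (6, 11, 1)]
rows += [(7, 5, 1)]
rows += [(c, m, 1) for c in (8, 9) for m in (9, 10, 11)]
rows += [(10, 11, 1)]

conn = sqlite3.connect(":memory:")
# seq is a hypothetical ordering column (assigned in insertion order)
conn.execute("CREATE TABLE table1 (seq INTEGER PRIMARY KEY, "
             "customer_number INTEGER, months INTEGER, flags INTEGER)")
conn.executemany(
    "INSERT INTO table1 (customer_number, months, flags) VALUES (?, ?, ?)", rows)

streaks = conn.execute("""
WITH marked AS (
    -- grp = number of breaks seen strictly BEFORE this row, so the
    -- breaking row itself still belongs to the streak it ends
    SELECT customer_number,
           COALESCE(SUM(flags <> 1) OVER (
               PARTITION BY customer_number ORDER BY seq
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS grp
    FROM table1
),
lengths AS (
    SELECT customer_number, grp, COUNT(*) AS streak
    FROM marked
    GROUP BY customer_number, grp
)
SELECT customer_number, MAX(streak)
FROM lengths
GROUP BY customer_number
ORDER BY customer_number
""").fetchall()

print(streaks)
```

For customer 6 the first five rows and the flag-4 row share group 0, giving a streak of 6, while the three rows after the break form group 1.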
I have a roles table, a company table, and an employee table in which we store, for each employee, which role he/she has at each company.
I'm trying to create a view which indicates for each combination of role and company and employee by a ‘Y’ or ‘N’ in the “checked_yn” column, whether this employee has this role at this company.
company table
----------------
|ID | name |
-----------------
| 1 | A |
| 2 | B |
-----------------
roles table
-------------
|ID | role |
-------------
| 1 | X |
| 2 | Y |
| 3 | Z |
-------------
employee table
----------------------------------------------
|ID | company_id | role_id | employee_log_id |
---------------------------------------------|
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 1 |
| 3 | 2 | null | 1 |
----------------------------------------------
The desired outcome is this:
EMPLOYEE_ROLES_VW view
------------------------------------------------------------------------
|Id |company_id | role_id | Checked_yn | employee_id | employee_log_id |
|----------------------------------------------------------------------|
| 1 | 1 | 1 | Y | 1 | 1 |
| 2 | 1 | 2 | Y | 2 | 1 |
| 3 | 1 | 3 | N | null | 1 |
| 4 | 2 | 1 | N | null | 1 |
| 5 | 2 | 2 | N | null | 1 |
| 6 | 2 | 3 | N | null | 1 |
------------------------------------------------------------------------
This is my current query:
with ROLES_X_COMP as (SELECT ROL.ID AS X_ROLE_ID,
COM.ID AS X_COMPANY_ID
FROM ROLES ROL
CROSS JOIN COMPANY COM)
SELECT ROWNUM AS ID,
EMP.ID AS SMCR_EMPLOYEE_ID,
EMP.EMPLOYEE_LOG_ID AS EMPLOYEE_LOG_ID,
ROLES_X_COMP.X_ROLE_ID ,
EMP.ROLE_ID AS ROLE_ID,
ROLES_X_COMP.X_COMPANY_ID,
EMP.COMPANY_ID AS COMPANY_ID,
CASE
WHEN ROLES_X_COMP.X_ROLE_ID = EMP.ROLE_ID AND ROLES_X_COMP.X_COMPANY_ID =
EMP.COMPANY_ID THEN 'Y'
ELSE 'N' END AS CHECKED_YN
FROM ROLES_X_COMP
LEFT OUTER JOIN EMPLOYEE EMP ON ROLES_X_COMP.X_COMPANY_ID = EMP.COMPANY_ID
Because the join on EMPLOYEE “finds” the company with id=1 twice, it joins twice with the cross join of the role and company tables. So I'm getting this result:
------------------------------------------------------------------------
|Id |company_id | role_id | Checked_yn | employee_id | employee_log_id |
|----------------------------------------------------------------------|
| 1 | 1 | 1 | Y | 1 | 1 |
| 2 | 1 | 2 | N | 1 | 1 |
| 3 | 1 | 3 | N | 1 | 1 |
| 4 | 1 | 1 | N | 2 | 1 |
| 5 | 1 | 2 | Y | 2 | 1 |
| 6 | 1 | 3 | N | 2 | 1 |
| 7 | 2 | 1 | N | 3 | 1 |
| 8 | 2 | 2 | N | 3 | 1 |
| 9 | 2 | 3 | N | 3 | 1 |
------------------------------------------------------------------------
I think a JOIN might be the wrong option here and a UNION more appropriate but I can't figure it out.
Use a partitioned outer join:
Query:
SELECT ROWNUM AS id,
e.company_id,
r.id AS role_id,
NVL2( e.role_id, 'Y', 'N' ) AS CheckedYN,
e.id AS employee_id,
e.employee_log_id
FROM roles r
LEFT OUTER JOIN
employee e
PARTITION BY ( e.company_id, e.employee_log_id )
ON ( r.id = e.role_id )
or (depending on how you want to partition and join the data):
SELECT ROWNUM AS id,
c.id AS company_id,
r.id AS role_id,
NVL2( e.role_id, 'Y', 'N' ) AS CheckedYN,
e.id AS employee_id,
e.employee_log_id
FROM roles r
CROSS JOIN
company c
LEFT OUTER JOIN
employee e
PARTITION BY ( e.employee_log_id )
ON ( c.id = e.company_id AND r.id = e.role_id )
Output:
Both output the same for the test data but may give differing results depending on your actual data.
ID | COMPANY_ID | ROLE_ID | CHECKEDYN | EMPLOYEE_ID | EMPLOYEE_LOG_ID
-: | ---------: | ------: | :-------- | ----------: | --------------:
1 | 1 | 1 | Y | 1 | 1
2 | 1 | 2 | Y | 2 | 1
3 | 1 | 3 | N | null | 1
4 | 2 | 1 | N | null | 1
5 | 2 | 2 | N | null | 1
6 | 2 | 3 | N | null | 1
AND ROLES_X_COMP.X_ROLE_ID = EMP.ROLE_ID
is missing at the end of your query.
With it added, the outcome will be
EMPLOYEE_ROLES_VW view
------------------------------------------------------------------------
|Id |company_id | role_id | Checked_yn | employee_id | employee_log_id |
|----------------------------------------------------------------------|
| 1 | 1 | 1 | Y | 1 | 1 |
| 2 | 1 | 2 | Y | 2 | 1 |
| 3 | 1 | 3 | N | null | null |
| 4 | 2 | 1 | N | null | null |
| 5 | 2 | 2 | N | null | null |
| 6 | 2 | 3 | N | null | null |
------------------------------------------------------------------------
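A quick way to check the effect of joining on both the company and the role is the sketch below, which uses SQLite through Python's `sqlite3` rather than Oracle. It is an approximation under stated assumptions: ROWNUM and the Oracle-only partitioned outer join are left out, NVL2 is replaced by a CASE expression, and `employee_id` is taken from `employee.id`. The table contents are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE roles (id INTEGER PRIMARY KEY, role TEXT);
CREATE TABLE employee (id INTEGER PRIMARY KEY, company_id INTEGER,
                       role_id INTEGER, employee_log_id INTEGER);
INSERT INTO company VALUES (1, 'A'), (2, 'B');
INSERT INTO roles VALUES (1, 'X'), (2, 'Y'), (3, 'Z');
INSERT INTO employee VALUES (1, 1, 1, 1), (2, 1, 2, 1), (3, 2, NULL, 1);
""")

view_rows = conn.execute("""
SELECT c.id AS company_id,
       r.id AS role_id,
       CASE WHEN e.id IS NULL THEN 'N' ELSE 'Y' END AS checked_yn,
       e.id AS employee_id,
       e.employee_log_id
FROM roles r
CROSS JOIN company c              -- every (role, company) combination
LEFT JOIN employee e              -- match on BOTH keys, not just the company
       ON e.company_id = c.id
      AND e.role_id = r.id
ORDER BY c.id, r.id
""").fetchall()

for row in view_rows:
    print(row)
```

Each (company, role) pair now appears exactly once. Unmatched pairs carry NULL in `employee_log_id`, which is the difference from the partitioned outer join, where that column is filled in for the sparse rows.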
I have the following table
postgres=# select * from joins_example;
user_id | price | id | email
---------+--------+----+--------------------------
1 | $30.00 | |
5 | $50.00 | |
7 | $20.00 | |
| | 1 | hadil#example.com
| | 5 | saiid#example.com
| | 2 | fahir#example.com
6 | $60.00 | 6 | oma#example.com
8 | $40.00 | 8 | nasim#example.com
| | 8 | nasim.hassan#example.com
9 | $40.00 | 9 | farah#example.com
9 | $70.00 | |
10 | $80.00 | | majid#example.com
| | 10 | majid.seif#example.com
(13 rows)
A self inner join between user_id and id produces
postgres=# select * from joins_example as x inner join joins_example as y on x.user_id = y.id;
user_id | price | id | email | user_id | price | id | email
---------+--------+----+-------------------+---------+--------+----+--------------------------
1 | $30.00 | | | | | 1 | hadil#example.com
5 | $50.00 | | | | | 5 | saiid#example.com
6 | $60.00 | 6 | oma#example.com | 6 | $60.00 | 6 | oma#example.com
8 | $40.00 | 8 | nasim#example.com | | | 8 | nasim.hassan#example.com
8 | $40.00 | 8 | nasim#example.com | 8 | $40.00 | 8 | nasim#example.com
9 | $40.00 | 9 | farah#example.com | 9 | $40.00 | 9 | farah#example.com
9 | $70.00 | | | 9 | $40.00 | 9 | farah#example.com
10 | $80.00 | | majid#example.com | | | 10 | majid.seif#example.com
(8 rows)
What I want is either:
user_id | price | id | email | user_id | price | id | email
---------+--------+----+-------------------+---------+--------+----+--------------------------
7 | $20.00 | | | | | |
| | | | | | 2 | fahir#example.com
or:
user_id | price | id | email | user_id | price | id | email
---------+--------+----+-------------------+---------+--------+----+--------------------------
| | | | 7 | $20.00 | |
| | 2 | fahir#example.com | | | |
Even
user_id | price | id | email
---------+--------+----+--------------------------
7 | $20.00 | |
| | 2 | fahir#example.com
would be a good start.
Specifically I want to know how to select only the rows from joins_example with user_ids or ids that don't exist in the inner join.
You could consider an approach that uses correlated subqueries with NOT EXISTS conditions:
select *
from joins_example as x
where
(
x.user_id is not null
and not exists (
select 1 from joins_example y where x.user_id = y.id
)
)
or (
x.id is not null
and not exists (
select 1 from joins_example y where x.id = y.user_id
)
)
Demo on DB Fiddle:
| user_id | price | id | email |
| ------- | ----- | --- | ----------------- |
| 7 | 20.00 | | |
| | | 2 | fahir#example.com |
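The NOT EXISTS approach can be reproduced locally with the sketch below. It uses SQLite through Python's `sqlite3` instead of Postgres, stores the prices as plain numbers rather than a money type, and adds an ORDER BY of my own so the result order is fixed. The rows are the 13 rows of `joins_example` from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joins_example "
             "(user_id INTEGER, price REAL, id INTEGER, email TEXT)")
conn.executemany("INSERT INTO joins_example VALUES (?, ?, ?, ?)", [
    (1, 30.0, None, None),
    (5, 50.0, None, None),
    (7, 20.0, None, None),
    (None, None, 1, "hadil#example.com"),
    (None, None, 5, "saiid#example.com"),
    (None, None, 2, "fahir#example.com"),
    (6, 60.0, 6, "oma#example.com"),
    (8, 40.0, 8, "nasim#example.com"),
    (None, None, 8, "nasim.hassan#example.com"),
    (9, 40.0, 9, "farah#example.com"),
    (9, 70.0, None, None),
    (10, 80.0, None, "majid#example.com"),
    (None, None, 10, "majid.seif#example.com"),
])

unmatched = conn.execute("""
SELECT x.user_id, x.price, x.id, x.email
FROM joins_example x
WHERE (x.user_id IS NOT NULL          -- user_id with no matching id anywhere
       AND NOT EXISTS (SELECT 1 FROM joins_example y WHERE y.id = x.user_id))
   OR (x.id IS NOT NULL               -- id with no matching user_id anywhere
       AND NOT EXISTS (SELECT 1 FROM joins_example y WHERE y.user_id = x.id))
ORDER BY x.user_id IS NULL, x.user_id, x.id
""").fetchall()

print(unmatched)
```

The IS NOT NULL guards matter: without them, rows whose key is NULL would also pass the NOT EXISTS test, because a comparison with NULL never matches.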
SELECT *
FROM joins_example j
WHERE (j.user_id IS NULL AND j.id IS NULL)
OR (j.user_id IS NOT NULL AND NOT EXISTS(SELECT 1 FROM joins_example j2 WHERE j2.id = j.user_id))
OR (j.id IS NOT NULL AND NOT EXISTS(SELECT 1 FROM joins_example j2 WHERE j2.user_id = j.id));
SELECT *
FROM joins_example AS w
LEFT JOIN (
select x.user_id
from joins_example as x
inner join joins_example as y on x.user_id = y.id
) AS z ON z.user_id = w.user_id or z.user_id = w.id
WHERE z.user_id IS NULL;
This is a good enough start, i.e.
user_id | price | id | email | user_id
---------+--------+----+-------------------+---------
7 | $20.00 | | |
| | 2 | fahir#example.com |
For this Table:
+----+--------+-------+
| ID | Status | Value |
+----+--------+-------+
| 1 | 1 | 4 |
| 2 | 1 | 7 |
| 3 | 1 | 9 |
| 4 | 2 | 1 |
| 5 | 2 | 7 |
| 6 | 1 | 8 |
| 7 | 1 | 9 |
| 8 | 2 | 1 |
| 9 | 0 | 4 |
| 10 | 0 | 3 |
| 11 | 0 | 8 |
| 12 | 1 | 9 |
| 13 | 3 | 1 |
+----+--------+-------+
I need to sum sequential groups with the same Status to produce this result.
+--------+------------+
| Status | Sum(Value) |
+--------+------------+
| 1 | 20 |
| 2 | 8 |
| 1 | 17 |
| 2 | 1 |
| 0 | 15 |
| 1 | 9 |
| 3 | 1 |
+--------+------------+
How can I do that in SQL Server?
NB: The values in the ID column are contiguous.
Per the tag I added to your question this is a gaps and islands problem.
The best performing solution will likely be
WITH T
AS (SELECT *,
ID - ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
FROM YourTable)
SELECT [STATUS],
SUM([VALUE]) AS [SUM(VALUE)]
FROM T
GROUP BY [STATUS],
Grp
ORDER BY MIN(ID)
If the ID values were not guaranteed contiguous as stated then you would need to use
ROW_NUMBER() OVER (ORDER BY [ID]) -
ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
instead in the CTE definition.
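To see the grouping trick at work on the table from the question, here is a small sketch using SQLite (3.25+) through Python's `sqlite3` rather than SQL Server, so the square-bracket quoting is dropped. It uses the difference of two ROW_NUMBER() values, the variant that does not rely on contiguous IDs. The table name `YourTable` is the placeholder used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable "
             "(ID INTEGER PRIMARY KEY, Status INTEGER, Value INTEGER)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", [
    (1, 1, 4), (2, 1, 7), (3, 1, 9), (4, 2, 1), (5, 2, 7),
    (6, 1, 8), (7, 1, 9), (8, 2, 1), (9, 0, 4), (10, 0, 3),
    (11, 0, 8), (12, 1, 9), (13, 3, 1),
])

islands = conn.execute("""
WITH T AS (
    SELECT ID, Status, Value,
           -- overall position minus position within the same Status:
           -- constant inside one consecutive run, different between runs
           ROW_NUMBER() OVER (ORDER BY ID)
         - ROW_NUMBER() OVER (PARTITION BY Status ORDER BY ID) AS Grp
    FROM YourTable
)
SELECT Status, SUM(Value)
FROM T
GROUP BY Status, Grp
ORDER BY MIN(ID)
""").fetchall()

print(islands)
```

For example, IDs 1 to 3 get Grp 0 and IDs 6 and 7 get Grp 2, so the two runs of Status 1 are summed separately (20 and 17).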
Basically, I have a table with all the bus stops of a route with the time_from_start value, that helps to put them in a good order.
CREATE TABLE `api_routestop` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`route_id` int(11) NOT NULL,
`station_id` varchar(10) NOT NULL,
`time_from_start` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `api_routestop_4fe3422a` (`route_id`),
KEY `api_routestop_15e3331d` (`station_id`)
)
For each stop of a line, I want to return the time to get to the next stop.
I tried this query:
SELECT r1.station_id, r2.station_id, r1.route_id, COUNT(*), (r2.time_from_start - r1.time_from_start) as time
FROM api_routestop r1
LEFT JOIN api_routestop r2 ON r1.route_id = r2.route_id AND r1.id <> r2.id
GROUP BY r1.station_id
HAVING time >= 0
ORDER BY r1.route_id, r1.time_from_start, r2.time_from_start
But the GROUP BY does not seem to work, and the result looks like this:
+------------+------------+----------+----------+------+
| station_id | station_id | route_id | COUNT(*) | time |
+------------+------------+----------+----------+------+
| Rub01 | Sal01 | 1 | 16 | 1 |
| Lyc02 | Sch02 | 2 | 17 | 2 |
| Paq01 | PoB01 | 3 | 15 | 1 |
| LaT02 | Gco02 | 4 | 16 | 1 |
| Sup01 | Tur01 | 5 | 132 | 1 |
| Oeu02 | CtC02 | 6 | 20 | 2 |
| Ver02 | Elo02 | 7 | 38 | 1 |
| Can01 | Mbo01 | 8 | 70 | 1 |
| Ver01 | Elo01 | 9 | 77 | 1 |
| MCH01 | for02 | 10 | 77 | 1 |
+------------+------------+----------+----------+------+
If I do this:
SELECT r1.station_id, r2.station_id, r1.route_id, COUNT(*), (r2.time_from_start - r1.time_from_start) as time
FROM api_routestop r1
LEFT JOIN api_routestop r2 ON r1.route_id = r2.route_id AND r1.id <> r2.id
GROUP BY r1.station_id, r2.station_id, r1.route_id
HAVING time >= 0
ORDER BY r1.route_id, r1.time_from_start, r2.time_from_start
I get closer:
+------------+------------+----------+----------+------+
| station_id | station_id | route_id | COUNT(*) | time |
+------------+------------+----------+----------+------+
| Rub01 | Sal01 | 1 | 1 | 1 |
| Rub01 | ARM01 | 1 | 1 | 2 |
| Rub01 | MaV01 | 1 | 1 | 4 |
| Rub01 | COl01 | 1 | 1 | 5 |
| Rub01 | Str01 | 1 | 1 | 6 |
| Rub01 | Jau01 | 1 | 1 | 7 |
| Rub01 | Cdp01 | 1 | 1 | 9 |
| Rub01 | Rep01 | 1 | 1 | 11 |
| Rub01 | CoT01 | 1 | 1 | 12 |
| Rub01 | Ctr01 | 1 | 1 | 14 |
| Rub01 | FLy01 | 1 | 1 | 15 |
| Rub01 | Lib01 | 1 | 1 | 17 |
| Rub01 | Bru01 | 1 | 1 | 18 |
| Rub01 | Sch01 | 1 | 1 | 20 |
| Rub01 | Lyc01 | 1 | 1 | 22 |
| Rub01 | Res01 | 1 | 1 | 24 |
| Sal01 | ARM01 | 1 | 1 | 1 |
| Sal01 | MaV01 | 1 | 1 | 3 |
| Sal01 | COl01 | 1 | 1 | 4 |
| Sal01 | Str01 | 1 | 1 | 5 |
| Sal01 | Jau01 | 1 | 1 | 6 |
| Sal01 | Cdp01 | 1 | 1 | 8 |
| Sal01 | Rep01 | 1 | 1 | 10 |
| Sal01 | CoT01 | 1 | 1 | 11 |
| Sal01 | Ctr01 | 1 | 1 | 13 |
| Sal01 | FLy01 | 1 | 1 | 14 |
| Sal01 | Lib01 | 1 | 1 | 16 |
| Sal01 | Bru01 | 1 | 1 | 17 |
| Sal01 | Sch01 | 1 | 1 | 19 |
| Sal01 | Lyc01 | 1 | 1 | 21 |
...
3769 rows in set (0.07 sec)
But what do I have to do to get only the first result for each r1.station_id and r1.route_id?
You're getting a lot of results back because you're getting every stop joined to every other stop on the same route.
So you'll need to identify the "next" stop as the stop on the same route with the smallest time_from_start that is later than the current stop's.
Update: added route_id to the next_stop subquery to deal with the case of stations used in multiple routes.
SELECT
r1.station_id,
r2.station_id,
r1.route_id,
r2.time_from_start - r1.time_from_start as time
FROM
api_routestop r1
INNER JOIN (SELECT
r1.station_id, r1.route_id, min(r2.time_from_start) next_time_from_start
FROM
api_routestop r1
LEFT JOIN api_routestop r2 ON r1.route_id = r2.route_id AND r1.id <> r2.id
and r2.time_from_start > r1.time_from_start
GROUP BY r1.station_id, r1.route_id) next_stop
ON r1.station_id = next_stop.station_id
and r1.route_id = next_stop.route_id
-- join the row of the next stop itself; it stays NULL for the last stop of a route
LEFT JOIN api_routestop r2
ON r2.time_from_start = next_stop.next_time_from_start
and r1.route_id = r2.route_id
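The "next stop" rule described above can be tried out with the sketch below. It is only an illustration: it runs on SQLite through Python's `sqlite3` instead of MySQL, it expresses the rule with a correlated MIN() subquery in the join condition rather than a derived table, and the six rows are hypothetical values made up for the example (station names borrowed from the output in the question, times invented).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE api_routestop (
    id INTEGER PRIMARY KEY,
    route_id INTEGER NOT NULL,
    station_id TEXT NOT NULL,
    time_from_start INTEGER NOT NULL
);
-- Hypothetical sample rows, not the real data set
INSERT INTO api_routestop (route_id, station_id, time_from_start) VALUES
    (1, 'Rub01', 0), (1, 'Sal01', 1), (1, 'ARM01', 2), (1, 'MaV01', 4),
    (2, 'Lyc02', 0), (2, 'Sch02', 2);
""")

next_stops = conn.execute("""
SELECT r1.route_id,
       r1.station_id,
       r2.station_id AS next_station_id,
       r2.time_from_start - r1.time_from_start AS time_to_next
FROM api_routestop r1
LEFT JOIN api_routestop r2
       ON r2.route_id = r1.route_id
          -- next stop = smallest later time_from_start on the same route
      AND r2.time_from_start = (SELECT MIN(r3.time_from_start)
                                FROM api_routestop r3
                                WHERE r3.route_id = r1.route_id
                                  AND r3.time_from_start > r1.time_from_start)
ORDER BY r1.route_id, r1.time_from_start
""").fetchall()

for row in next_stops:
    print(row)
```

The last stop of each route has no later stop, so the subquery yields NULL and the LEFT JOIN leaves `next_station_id` and `time_to_next` empty for it.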
SELECT station_id, coalesce(
(SELECT time_from_start
FROM api_routestop t2
WHERE t2.time_from_start > t1.time_from_start
AND t2.time_from_start <= (SELECT time_from_start FROM api_routestop t5 WHERE t5.station_id = '4' AND t5.route_id=t1.route_id)
AND t2.route_id = t1.route_id
ORDER BY t2.time_from_start LIMIT 1), time_from_start) - time_from_start AS difference
FROM api_routestop t1
WHERE t1.route_id = 1
AND t1.time_from_start >= (SELECT time_from_start FROM api_routestop t4 WHERE t4.station_id = '2' AND t4.route_id=t1.route_id)
AND t1.time_from_start <= (SELECT time_from_start FROM api_routestop t5 WHERE t5.station_id = '4' AND t5.route_id=t1.route_id)
ORDER BY time_from_start
Are you open to changing the schema? If so, simply adding a column containing a sequential integer for all stops on a route will make this query a lot easier and more efficient.
Failing that, this will do it.
SELECT
station_id,
route_id,
time_from_start,
time_to_next
FROM
(
SELECT
station_id,route_id,time_from_start,
IF( @prev <> route_id, null, @time_from_start-time_from_start ) AS time_to_next,
@time_from_start := time_from_start,
@prev := route_id
FROM api_routestop
JOIN (SELECT @time_from_start := NULL, @prev := 0) AS r
ORDER BY route_id, time_from_start DESC
) t
ORDER BY route_id,time_from_start