Grouping by multiple ranges in SQL Server - sql

I tried to search for a solution, but with no success.
How can I group my table from looking like this:
from | to | zone
1 | 1 | 1
1 | 2 | 1
1 | 3 | 1
1 | 4 | 2
1 | 5 | 2
1 | 6 | 2
1 | 7 | 1
1 | 8 | 1
1 | 9 | 1
1 | 10 | 9
2 | 1 | 7
2 | 2 | 7
2 | 3 | 7
2 | 4 | 2
2 | 5 | 2
2 | 6 | 2
2 | 7 | 7
2 | 8 | 7
2 | 9 | 7
to look like this:
from | to | zone
1 | 1-3 | 1
1 | 4-6 | 2
1 | 7-9 | 1
1 | 10 | 9
2 | 1-3 | 7
2 | 4-6 | 2
2 | 7-9 | 7
Thank you for your help

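One approach here is the difference-of-row-numbers method: use the [to] column as one sequence, and a row number partitioned by [from] and zone (ordered by [to]) as the other. Within a run of consecutive [to] values that share the same zone, both sequences increase by one per row, so their difference stays constant; as soon as the run is interrupted, the difference changes. Grouping by that difference therefore isolates each run. It might be best to view the demo link below to explore the query.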
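WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY [from], zone ORDER BY [to]) AS rn
    FROM yourTable
)
SELECT
    t.[from],
    -- show a single value (e.g. 10) when the island holds only one row
    CASE WHEN MIN(t.[to]) = MAX(t.[to])
         THEN CONVERT(varchar(10), MIN(t.[to]))
         ELSE CONVERT(varchar(10), MIN(t.[to])) + '-' + CONVERT(varchar(10), MAX(t.[to]))
    END AS [to],
    t.zone
FROM cte t
GROUP BY
    t.[from],
    t.zone,
    t.[to] - t.rn
ORDER BY
    t.[from],
    MIN(t.[to]);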
Demo here:
Rextester
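To see why the difference stays constant inside each run, the same idea can be traced outside the database. The following is only a minimal Python sketch of the logic, not the SQL Server query itself; the `islands` helper and the tuple layout are assumptions made for illustration, and the rows are the sample data from the question.

```python
from itertools import groupby

# Sample rows from the question as (from, to, zone) tuples.
rows = [(1, t, z) for t, z in zip(range(1, 11), [1, 1, 1, 2, 2, 2, 1, 1, 1, 9])]
rows += [(2, t, z) for t, z in zip(range(1, 10), [7, 7, 7, 2, 2, 2, 7, 7, 7])]

def islands(rows):
    """Group rows into islands via the difference-of-row-numbers idea."""
    counters = {}   # running row number per (from, zone) partition
    keyed = []
    for frm, to, zone in sorted(rows):          # ordered by from, then to
        rn = counters[(frm, zone)] = counters.get((frm, zone), 0) + 1
        keyed.append((frm, zone, to - rn, to))  # to - rn is constant per island
    keyed.sort()
    out = []
    for (frm, zone, _diff), grp in groupby(keyed, key=lambda r: r[:3]):
        tos = [r[3] for r in grp]
        out.append((frm, min(tos), max(tos), zone))
    return sorted(out)

result = islands(rows)
for frm, lo, hi, zone in result:
    print(frm, f"{lo}-{hi}" if lo != hi else lo, zone)
```

Each output tuple holds the first and last [to] value of one island, which is what MIN and MAX return per group in the query above.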

This is generally called a gaps-and-islands problem. If you are using SQL Server 2012 or later, you can combine LAG with a running SUM:
;WITH cte
AS (SELECT *,
           -- running count of zone changes = island number within each [from]
           SUM(CASE WHEN zone = prev_zone THEN 0 ELSE 1 END)
               OVER (PARTITION BY [from] ORDER BY [to]) AS grp
    FROM (SELECT *,
                 LAG(zone) OVER (PARTITION BY [from] ORDER BY [to]) AS prev_zone
          FROM yourtable) a)
SELECT [from],
       [to] = CONCAT(MIN([to]), '-', MAX([to])),
       zone = MIN(zone)
FROM cte
GROUP BY [from], grp
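The running sum in this query acts as an island counter: it goes up by one each time the zone differs from the previous row. As a rough illustration only, here is the same computation in Python for the rows with from = 1; `island_ids` is a hypothetical helper name, not part of any library.

```python
def island_ids(zones):
    """Return a group id per row that increments whenever the zone changes."""
    ids = []
    grp = 0
    prev = object()          # sentinel: the first row always starts a new group
    for zone in zones:
        if zone != prev:     # mirrors CASE WHEN zone = prev_zone THEN 0 ELSE 1
            grp += 1
        ids.append(grp)
        prev = zone
    return ids

# Zones for from = 1, ordered by [to], as in the question.
zones = [1, 1, 1, 2, 2, 2, 1, 1, 1, 9]
groups = island_ids(zones)
print(groups)
```

Rows that share a group id form one island, so the second run of zone 1 gets its own id even though the zone value repeats.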

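-- Note: grouping by [from] and [zone] alone merges separate runs of the same zone,
-- e.g. from = 1, zone = 1 comes out as 1-9 rather than 1-3 and 7-9.
;with mycte
AS
(
select
[from]
,min([to]) minto
,max([to]) maxto
,[zone]
from
mytable
group by
[from]
,[zone]
)
select
[from] AS [from]
,concat(minto, '-', maxto) AS [to]
,[zone] AS [zone]
from
mycte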

Related

SQL Query to apply a command to multiple rows

I am new to SQL and trying to write a statement similar to a 'for loop' in other languages and am stuck. I want to filter out rows of the table where for all of attribute 1, attribute2=attribute3 without using functions.
For example:
| Year | Month | Day|
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 4 |
| 2 | 3 | 3 |
| 2 | 4 | 4 |
| 3 | 4 | 4 |
| 3 | 4 | 4 |
| 3 | 4 | 4 |
I would only want the row
| Year | Month | Day|
|:---- |:------:| -----:|
| 3 | 4 | 4 |
because it is the only year where month and day are equal in every row that shares that year.
So far I have
select year, month, day from dates
where month=day
but I am unsure how to apply the constraint across all rows of a year.
-- month/day need to appear in aggregate functions (since they are not in the GROUP BY clause),
-- but the HAVING clause ensures we only have 1 month/day value (per year) here, so MIN or AVG would work too
SELECT year, MAX(month), MAX(day)
FROM my_table
GROUP BY year
HAVING COUNT(DISTINCT (month, day)) = 1;
| year | max | max |
| 3 | 4 | 4 |
View on DB Fiddle
So one way would be
select distinct [year], [month], [day]
from [Table] t
where [month]=[day]
and not exists (
select * from [Table] x
where t.[year]=x.[year] and t.[month] <> x.[month] and t.[day] <> x.[day]
)
And another way would be
select distinct [year], [month], [day] from (
select *,
Lead([month],1) over(partition by [year] order by [month])m2,
Lead([day],1) over(partition by [year] order by [day])d2
from [table]
)x
where [month]=m2 and [day]=d2

How to create column for every single integer within a range in SQLite?

Here's some sample data from my table:
day_number daily_users_count
1 1
3 1
6 1
7 1
9 2
10 2
I need all day_number values, from 1 to max(day_number), and I want daily_users_count to be zero if it isn't mentioned in this table.
It should look something like this:
day_number daily_users_count
1 1
2 0
3 1
4 0
5 0
6 1
7 1
8 0
9 2
10 2
I think a left join with a table which has a number column with all integers from 1 to max(day_number) would work, if I put a default value for daily_users_count as 0.
What I don't get is how to create such a table where all integers within a certain range are present. Any alternate solutions or any ways to do this would be much appreciated.
You can do it with a recursive CTE which will return all the day_numbers including the missing ones and then a LEFT join to the table:
with cte as (
select min(day_number) day_number from tablename
union all
select day_number + 1 from cte
where day_number < (select max(day_number) from tablename)
)
select c.day_number,
coalesce(t.daily_users_count, 0) daily_users_count
from cte c left join tablename t
on t.day_number = c.day_number
See the demo.
Results:
| day_number | daily_users_count |
| ---------- | ----------------- |
| 1 | 1 |
| 2 | 0 |
| 3 | 1 |
| 4 | 0 |
| 5 | 0 |
| 6 | 1 |
| 7 | 1 |
| 8 | 0 |
| 9 | 2 |
| 10 | 2 |
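The recursive CTE only generates the full run of day numbers; the LEFT JOIN plus COALESCE then fills the gaps with zero. A minimal Python sketch of that same two-step idea, using the sample data from the question (the `counts` dictionary stands in for the table and is an assumption for illustration):

```python
# day_number -> daily_users_count, as in the question's sample table.
counts = {1: 1, 3: 1, 6: 1, 7: 1, 9: 2, 10: 2}

# Step 1: every day number from the smallest to the largest one present
# (the role of the recursive CTE).
all_days = range(min(counts), max(counts) + 1)

# Step 2: look each day up and default to 0 when it is missing
# (the role of LEFT JOIN + COALESCE).
filled = [(day, counts.get(day, 0)) for day in all_days]

for day, users in filled:
    print(day, users)
```

Like the query, this starts at the smallest day number present; if the series must always begin at 1, the lower bound would be the literal 1 instead.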

Recursive join with SUM

I have data in the following format:
FromStateID ToStateID Seconds
1 2 10
2 3 20
3 4 15
4 5 5
I need the following output
FromStateID ToStateID Seconds
1 2 10
2 3 20
3 4 15
4 5 5
1 3 10+20
1 4 10+20+15
1 5 10+20+15+5
2 4 20+15
2 5 20+15+5
3 5 15+5
This output shows the total time taken to get from FromStateID to ToStateID, for every combination, in chronological order.
Please help.
I think this is a recursive CTE that follows the links:
with cte as (
select FromStateID, ToStateID, Seconds
from t
union all
select cte.FromStateId, t.ToStateId, cte.Seconds + t.Seconds
from cte join
t
on cte.toStateId = t.FromStateId
)
select *
from cte;
Here is a db<>fiddle.
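The recursion starts from the direct transitions and keeps extending each path by one more step, adding up the seconds as it goes. The following Python sketch mirrors that expansion under the assumption that the transitions contain no cycles (a cycle would make both this loop and the CTE run indefinitely); `all_paths` is a hypothetical name used only here.

```python
# (FromStateID, ToStateID, Seconds) rows from the question.
edges = [(1, 2, 10), (2, 3, 20), (3, 4, 15), (4, 5, 5)]

def all_paths(edges):
    """Expand direct transitions into every reachable from/to pair."""
    result = list(edges)      # anchor member: the direct transitions
    frontier = list(edges)
    while frontier:           # recursive member: extend each path by one step
        frontier = [
            (start, nxt_to, secs + nxt_secs)
            for (start, to, secs) in frontier
            for (nxt_from, nxt_to, nxt_secs) in edges
            if to == nxt_from
        ]
        result.extend(frontier)
    return result

paths = sorted(all_paths(edges))
for row in paths:
    print(row)
```

The loop stops once no path can be extended further, which corresponds to the recursive member returning no rows.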
@Gordon Linoff's answer is the better solution. Below is another option to achieve the same result.
You can achieve this using CROSS JOIN, CROSS APPLY and SUM:
DECLARE @table table(FromStateId int, ToStateId int, seconds int)
insert into @table
values
(1 ,2 ,10),
(2 ,3 ,20),
(3 ,4 ,15),
(4 ,5 ,5 );
;with cte_fromToCombination as
(select f.fromStateId, t.tostateId
from
(select distinct fromStateId from @table) as f
cross join
(select distinct toStateId from @table) as t
)
select c.FromStateId, c.ToStateId, t.sumseconds as Total_seconds
from cte_fromToCombination as c
CROSS APPLY
(SELECT sum(t.seconds)
from
@table as t
WHERE t.FromStateId >= c.FromStateId -- only count steps inside the requested range
AND t.ToStateId <= c.ToStateId
) as t(sumseconds)
where c.tostateId > c.fromStateId
order by FromStateId,ToStateId
+-------------+-----------+---------------+
| FromStateId | ToStateId | Total_seconds |
+-------------+-----------+---------------+
| 1 | 2 | 10 |
| 1 | 3 | 30 |
| 1 | 4 | 45 |
| 1 | 5 | 50 |
| 2 | 3 | 20 |
| 2 | 4 | 35 |
| 2 | 5 | 40 |
| 3 | 4 | 15 |
| 3 | 5 | 20 |
| 4 | 5 | 5 |
+-------------+-----------+---------------+

Get users who took ride for 3 or more consecutive dates

I have below table, it shows user_id and ride_date.
+---------+------------+
| user_id | ride_date |
+---------+------------+
| 1 | 2019-11-01 |
| 1 | 2019-11-03 |
| 1 | 2019-11-05 |
| 2 | 2019-11-03 |
| 2 | 2019-11-04 |
| 2 | 2019-11-05 |
| 2 | 2019-11-06 |
| 3 | 2019-11-03 |
| 3 | 2019-11-04 |
| 3 | 2019-11-05 |
| 3 | 2019-11-06 |
| 4 | 2019-11-05 |
| 4 | 2019-11-07 |
| 4 | 2019-11-08 |
| 4 | 2019-11-09 |
| 5 | 2019-11-11 |
| 5 | 2019-11-13 |
+---------+------------+
I want the user_id of every user who took rides on 3 or more consecutive days, along with the days on which they took those consecutive rides.
The desired result is as below
+---------+-----------------------+
| user_id | consecutive_ride_date |
+---------+-----------------------+
| 2 | 2019-11-03 |
| 2 | 2019-11-04 |
| 2 | 2019-11-05 |
| 2 | 2019-11-06 |
| 3 | 2019-11-03 |
| 3 | 2019-11-04 |
| 3 | 2019-11-05 |
| 3 | 2019-11-06 |
| 4 | 2019-11-07 |
| 4 | 2019-11-08 |
| 4 | 2019-11-09 |
+---------+-----------------------+
SQL Fiddle
With LAG() and LEAD() window functions:
with cte as (
select *,
datediff(
day,
lag([ride_date]) over (partition by [user_id] order by [ride_date]),
[ride_date]
) prev1,
datediff(
day,
lag([ride_date], 2) over (partition by [user_id] order by [ride_date]),
[ride_date]
) prev2,
datediff(
day,
[ride_date],
lead([ride_date]) over (partition by [user_id] order by [ride_date])
) next1,
datediff(
day,
[ride_date],
lead([ride_date], 2) over (partition by [user_id] order by [ride_date])
) next2
from Table1
)
select [user_id], [ride_date]
from cte
where
(prev1 = 1 and prev2 = 2) or
(prev1 = 1 and next1 = 1) or
(next1 = 1 and next2 = 2)
See the demo.
Results:
> user_id | ride_date
> ------: | :---------
> 2 | 03/11/2019
> 2 | 04/11/2019
> 2 | 05/11/2019
> 2 | 06/11/2019
> 3 | 03/11/2019
> 3 | 04/11/2019
> 3 | 05/11/2019
> 3 | 06/11/2019
> 4 | 07/11/2019
> 4 | 08/11/2019
> 4 | 09/11/2019
Here is one way to address this gaps-and-islands problem:
first, assign a rank to each user ride with row_number(), and recover the previous ride_date (aliased lag_ride_date)
then, compare the date of the previous ride to the current one in a conditional sum that increases when the dates are successive; subtracting this sum from the rank of the user ride gives groups (aliased grp) that represent consecutive rides with a 1-day spacing
do a window count of how many records belong to each group (aliased cnt)
filter on records whose window count is greater than or equal to 3
Query:
select user_id, ride_date
from (
select
t.*,
count(*) over(partition by user_id, grp) cnt
from (
select
t.*,
rn1
- sum(case when ride_date = dateadd(day, 1, lag_ride_date) then 1 else 0 end)
over(partition by user_id order by ride_date) grp
from (
select
t.*,
row_number() over(partition by user_id order by ride_date) rn1,
lag(ride_date) over(partition by user_id order by ride_date) lag_ride_date
from Table1 t
) t
) t
) t
where cnt >= 3
Demo on DB Fiddle
This is a typical gaps-and-islands problem.
We can solve it as follows:
with data
as (
select user_id
,ride_date
,dateadd(day
,-row_number() over(partition by user_id order by ride_date asc)
,ride_date) as grp_field
from Table1
)
,consecutive_days
as(
select user_id
,ride_date
,count(*) over(partition by user_id,grp_field) as cnt
from data
)
select *
from consecutive_days
where cnt>=3
order by user_id,ride_date
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=7bb851d9a12966b54afb4d8b144f3d46
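The grouping key in this query is the ride date minus the row number: consecutive dates advance by one day per row, so subtracting the row number yields the same date for every ride in a run. A small Python sketch of that idea on the question's data (the `rides` dictionary and `consecutive_rides` helper are assumptions for illustration, not part of the answer):

```python
from collections import defaultdict
from datetime import date, timedelta

# Ride days in November 2019 per user_id, taken from the question's table.
rides = {1: [1, 3, 5], 2: [3, 4, 5, 6], 3: [3, 4, 5, 6],
         4: [5, 7, 8, 9], 5: [11, 13]}

def consecutive_rides(rides, min_len=3):
    """Return (user_id, ride_date) pairs that belong to runs of min_len or more days."""
    out = []
    for user, days in rides.items():
        runs = defaultdict(list)
        dates = sorted(date(2019, 11, d) for d in days)
        for rn, ride_date in enumerate(dates, start=1):
            # same key for every date in an unbroken run (the grp_field column)
            runs[ride_date - timedelta(days=rn)].append(ride_date)
        for run in runs.values():
            if len(run) >= min_len:
                out.extend((user, d) for d in run)
    return sorted(out)

result = consecutive_rides(rides)
for user, ride_date in result:
    print(user, ride_date.isoformat())
```

This assumes at most one ride per user per day, as in the sample data; duplicate dates would break the one-day-per-row relationship the key relies on.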
There is no need to apply gaps-and-islands methodologies to this problem. The problem is much simpler to solve.
You can return the users and first date just by using LEAD():
SELECT t1.*
FROM (SELECT t1.*,
LEAD(ride_date, 2) OVER (PARTITION BY user_id ORDER BY ride_date) as ride_date_2
FROM table1 t1
) t1
WHERE ride_date_2 = DATEADD(day, 2, ride_date);
If you want the actual dates, you can unpivot the results:
SELECT DISTINCT t1.user_id, v.ride_date
FROM (SELECT t1.*,
LEAD(ride_date, 2) OVER (PARTITION BY user_id ORDER BY ride_date) as ride_date_2
FROM table1 t1
) t1 CROSS APPLY
(VALUES (t1.ride_date),
(DATEADD(day, 1, t1.ride_date)),
(DATEADD(day, 2, t1.ride_date))
) v(ride_date)
WHERE t1.ride_date_2 = DATEADD(day, 2, t1.ride_date)
ORDER BY t1.user_id, v.ride_date;

Order rows by ntile and row_number

I'm trying to build a stored procedure that will return data for a Crystal Reports report.
Inside CR I'm using a multi-column layout.
I want to get a 3-column layout, something like this:
1 5 8
2 6 9
3 7 10
4
But because CR has some layout issues it is ordering my table like this:
1 2 3
4 5 6
7 8 9
10
So I've tried to create a procedure that returns an extra column on which I'll sort my data.
So instead of the order 1,2,3,4 I need 1,4,7,10,2,5,8,3,6,9...
I have a table with this data:
ID | CASE_ID | CASE_DATE
--------------------------
1 | 1 | 2014-02-03
2 | 1 | 2014-02-04
3 | 1 | 2014-02-05
4 | 1 | 2014-02-06
5 | 1 | 2014-02-07
6 | 1 | 2014-02-08
7 | 1 | 2014-02-09
8 | 1 | 2014-02-10
9 | 1 | 2014-02-11
10 | 1 | 2014-02-12
and I need a stored procedure that will return this data:
ID | CASE_ID | CASE_DATE | ORDER
---------------------------------
1 | 1 | 2014-02-03 | 1
2 | 1 | 2014-02-04 | 5
3 | 1 | 2014-02-05 | 8
4 | 1 | 2014-02-06 | 2
5 | 1 | 2014-02-07 | 6
6 | 1 | 2014-02-08 | 9
7 | 1 | 2014-02-09 | 3
8 | 1 | 2014-02-10 | 7
9 | 1 | 2014-02-11 | 10
10 | 1 | 2014-02-12 | 4
Here is sql fiddle with sample data and my code: http://sqlfiddle.com/#!3/c24c1/1
Idea behind sort column:
divide all rows into 3 groups (ntile), take first item from first group, then first from second and first from third group
EDIT:
Here is my temporary solution, I hope that running this will clarify what I had in mind when I was asking this question:
--DECLARE @NUM INT;
--SET @NUM=3;
SELECT ID,
CASE_ID,
CONVERT(NVARCHAR(10),CASE_DATE,121) AS DATA,
(ROW1 - 1) * 3/*@NUM*/ + COL AS [ORDER]
FROM
( SELECT CASE_ID,
ID,
ROW AS LP,
COL,
ROW_NUMBER() OVER (PARTITION BY CASE_ID, COL ORDER BY ROW) AS ROW1,
CASE_DATE
FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY D.CASE_ID ORDER BY D.ID) AS ROW,
NTILE(3/*@NUM*/) OVER (PARTITION BY D.CASE_ID ORDER BY D.ID) AS COL,
ID,
D.CASE_ID,
CASE_DATE
FROM DATA D
WHERE D.CASE_ID = 1)X )Y
ORDER BY Y.CASE_ID,
LP
Edit: It looks like you actually want an ORDER column, not just the rows returned in that order.
SELECT ID,
CASE_ID,
DATA,
ROW_NUMBER() OVER (ORDER BY ROW, N) AS [ORDER]
FROM (
SELECT ID,
CASE_ID,
N,
ROW_NUMBER() OVER (PARTITION BY CASE_ID, N ORDER BY ID) AS ROW,
DATA
FROM (
SELECT
ID,
CASE_ID,
NTILE(3) OVER (PARTITION BY CASE_ID ORDER BY ID) AS N,
CONVERT(NVARCHAR(10), CASE_DATE,121) AS DATA
FROM DATA
WHERE CASE_ID = 1 ) X ) Y
ORDER BY ID;
SQLFiddle
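To check what this query produces, the two window functions can be traced by hand: NTILE(3) assigns each row to a layout column, ROW_NUMBER gives its position inside that column, and sorting by (position, column) yields the order in which a row-by-row layout engine must receive the rows. The Python sketch below is only an illustration of that logic; the `ntile` helper is an assumption that follows the documented NTILE behaviour of putting the larger groups first.

```python
def ntile(n_rows, buckets):
    """Bucket number (1-based) for each of n_rows ordered rows.

    Bucket sizes differ by at most one, and the larger buckets come first.
    """
    base, extra = divmod(n_rows, buckets)
    tiles = []
    for b in range(1, buckets + 1):
        tiles.extend([b] * (base + (1 if b <= extra else 0)))
    return tiles

ids = list(range(1, 11))             # the ten sample IDs
tiles = ntile(len(ids), 3)           # layout column per ID

seen = {}                            # running position inside each column
keyed = []
for row_id, tile in zip(ids, tiles):
    seen[tile] = seen.get(tile, 0) + 1
    keyed.append((seen[tile], tile, row_id))

# Sorting by (position, column) is ROW_NUMBER() OVER (ORDER BY ROW, N).
display_order = [row_id for _, _, row_id in sorted(keyed)]
print(display_order)
```

Feeding the IDs to a three-column, row-by-row layout in this order places 1, 2, 3, 4 down the first column, 5, 6, 7 down the second, and 8, 9, 10 down the third.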