Group rows by condition - sql

I have this data:
Start End Quantity
425 449 24
450 474 24
475 499 24
500 524 24
2300 2324 24
2400 2499 99
2500 2599 99
2800 2899 99
2900 2999 99
3200 3249 49
3250 3299 49
3300 3349 49
3350 3399 49
3400 3449 49
3500 3549 49
3600 3624 24
3650 3674 24
3700 3724 24
3950 3964 14
4000 4000 0
4150 4399 249
4400 4499 99
5034 5075 41
Quantity is a result of End - Start.
I would like to obtain the following data, the Generated rows:
Start End Quantity
425 449 24
450 474 24
475 499 24
500 524 24
425 524 96
2300 2324 24
2300 2324 24
2400 2499 99
2500 2599 99
-----GENERATED----
425 2599 438
------------------
2800 2899 99
2900 2999 99
3200 3249 49
3250 3299 49
3300 3349 49
3350 3399 49
3400 3449 49
3500 3549 49
-----GENERATED-----
2800 3549 492
------------------
3600 3624 24
3650 3674 24
3700 3724 24
3950 3964 14
4000 4000 0
4150 4399 249
4400 4499 99
5034 5075 41
-----GENERATED-----
3600 5075 475
------------------
The rule is to keep summing the quantities while the running total stays below 500. Once the next row would push it past 500, start a new count from that row.
I have tried ROLLUP, but I couldn't find the right condition to make it work.
Of course, this is much easier to do in application code than in SQL, but we must do it in the database. Any tool is acceptable for producing the generated rows: looping functions, new tables, etc.
Error solving
I ran into an error while running @Prdp's query:
Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
I found the solution here:
http://sqlhints.com/tag/the-statement-terminated-the-maximum-recursion-100-has-been-exhausted-before-statement-completion/
Update 1
Using @Prdp's query we got the following:
Start End rn st
(400) 424 1 24
425 449 2 48
450 474 3 72
475 499 4 96
500 524 5 120
2300 2324 6 144
2400 2499 7 243
2500 2599 8 342
2800 (2899) 9 (441)
(2900) 2999 10 99
3200 3249 11 148
3250 3299 12 197
3300 3349 13 246
3350 3399 14 295
3400 3449 15 344
3500 3549 16 393
3600 3624 17 417
3650 3674 18 441
3700 3724 19 465
3950 3964 20 479
4000 (4000) 21 (479)
(4150) 4399 22 249
4400 4499 23 348
5034 (5075) 24 (389)
It's getting closer to what we need. Would it be possible to extract only the values between ( and ) and discard the rest?
We can use cursors too.

You can use a recursive CTE. I can't think of a better way.
;WITH cte
AS (SELECT *,
           Row_number() OVER (ORDER BY Start) rn
    FROM Yourtable),
rec_cte
AS (SELECT *,
           ( [End] - Start ) AS st,
           1 AS grp
    FROM cte
    WHERE rn = 1
    UNION ALL
    SELECT a.*,
           CASE
             WHEN st + ( a.[End] - a.Start ) >= 500 THEN a.[End] - a.Start
             ELSE st + ( a.[End] - a.Start )
           END,
           CASE
             WHEN st + ( a.[End] - a.Start ) >= 500 THEN b.grp + 1
             ELSE grp
           END
    FROM cte a
         JOIN rec_cte b
           ON a.rn = b.rn + 1)
SELECT Min(Start) AS Start,
       Max([End]) AS [End],
       Max(st) AS Quantity
FROM rec_cte
GROUP BY grp
OPTION (maxrecursion 0)
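To try the same running-sum-with-reset idea without a SQL Server instance, here is a minimal sketch that ports the recursive CTE to SQLite through Python's sqlite3 module. The table name ranges, the columns start_val and end_val, and the helper group_ranges are assumptions made for this illustration, and SQLite only stands in for SQL Server here, so the OPTION (maxrecursion 0) hint is not used.

```python
import sqlite3

# Sample rows from the question: (Start, End).
ROWS = [
    (425, 449), (450, 474), (475, 499), (500, 524), (2300, 2324),
    (2400, 2499), (2500, 2599), (2800, 2899), (2900, 2999),
    (3200, 3249), (3250, 3299), (3300, 3349), (3350, 3399),
    (3400, 3449), (3500, 3549), (3600, 3624), (3650, 3674),
    (3700, 3724), (3950, 3964), (4000, 4000), (4150, 4399),
    (4400, 4499), (5034, 5075),
]

QUERY = """
WITH RECURSIVE
ordered AS (
    SELECT start_val, end_val,
           ROW_NUMBER() OVER (ORDER BY start_val) AS rn
    FROM ranges
),
running AS (
    -- Anchor: the first row opens group 1.
    SELECT rn, start_val, end_val, end_val - start_val AS st, 1 AS grp
    FROM ordered
    WHERE rn = 1
    UNION ALL
    -- Step: add the next row, or restart the sum in a new group
    -- when the running total would reach 500.
    SELECT o.rn, o.start_val, o.end_val,
           CASE WHEN r.st + (o.end_val - o.start_val) >= 500
                THEN o.end_val - o.start_val
                ELSE r.st + (o.end_val - o.start_val) END,
           CASE WHEN r.st + (o.end_val - o.start_val) >= 500
                THEN r.grp + 1
                ELSE r.grp END
    FROM ordered AS o
    JOIN running AS r ON o.rn = r.rn + 1
)
SELECT MIN(start_val), MAX(end_val), MAX(st)
FROM running
GROUP BY grp
ORDER BY grp
"""


def group_ranges(rows):
    """Load the rows into an in-memory table and return one tuple per group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ranges (start_val INTEGER, end_val INTEGER)")
    conn.executemany("INSERT INTO ranges VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


groups = group_ranges(ROWS)
for start, end, quantity in groups:
    print(start, end, quantity)
```

On the 23 sample rows this prints three groups: 425 2899 417, 2900 4000 479 and 4150 5075 389. A group is closed as soon as adding the next row would bring the running total to 500 or more.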

Here is a proposed solution in MySQL. A similar strategy should work in SQL Server.
drop table if exists TestData;
create table TestData(Start int, End int, Quantity int);
insert TestData values (425,449,24);
insert TestData values (450,474,24);
insert TestData values (475,499,24);
insert TestData values (500,524,24);
insert TestData values (2300,2324,24);
insert TestData values (2400,2499,99);
insert TestData values (2500,2599,99);
insert TestData values (2800,2899,99);
insert TestData values (2900,2999,99);
insert TestData values (3200,3249,49);
insert TestData values (3250,3299,49);
insert TestData values (3300,3349,49);
insert TestData values (3350,3399,49);
insert TestData values (3400,3449,49);
insert TestData values (3500,3549,49);
insert TestData values (3600,3624,24);
insert TestData values (3650,3674,24);
insert TestData values (3700,3724,24);
insert TestData values (3950,3964,14);
insert TestData values (4000,4000,0);
insert TestData values (4150,4399,249);
insert TestData values (4400,4499,99);
insert TestData values (5034,5075,41);
drop table if exists DataRange;
create table DataRange (StartRange int, EndRange int);
insert DataRange values (425, 2599);
insert DataRange values (2800,3549);
insert DataRange values (3600,5075);
select
DataRange.StartRange,DataRange.EndRange
,sum(TestData.quantity) as Quantity
from TestData
inner join DataRange on
(TestData.start between DataRange.StartRange and DataRange.EndRange )
or
(TestData.End between DataRange.StartRange and DataRange.EndRange )
group by DataRange.StartRange,DataRange.EndRange

Related

Summing column that is grouped - SQL

I have a query:
SELECT
    date,
    COUNT(o.row_number) FILTER (WHERE o.row_number > 1 AND date_ddr IS NOT NULL AND telephone_number <> 'Anonymous') repeat_calls_24h
FROM
(
    SELECT
        date,
        telephone_number,
        date_ddr,
        ROW_NUMBER() OVER(PARTITION BY ddr.telephone_number ORDER BY ddr.date) row_number
    FROM
        table_a ddr
) o
GROUP BY 1
Generating the following table:
date        Repeat calls_24h
17/09/2022  182
18/09/2022  381
19/09/2022  81
20/09/2022  24
21/09/2022  91
22/09/2022  110
23/09/2022  231
What can I add to my query to provide a three-day sum (the current day plus the previous two), as shown below?
date        Repeat calls_24h  Repeat Calls 3d
17/09/2022  182
18/09/2022  381
19/09/2022  81                644
20/09/2022  24                486
21/09/2022  91                196
22/09/2022  110               225
23/09/2022  231               432
Thanks
We can do it using lag.
select "date"
,"Repeat calls_24h"
,"Repeat calls_24h" + lag("Repeat calls_24h") over(order by "date") + lag("Repeat calls_24h", 2) over(order by "date") as "Repeat Calls 3d"
from t
date        Repeat calls_24h  Repeat Calls 3d
2022-09-17  182               null
2022-09-18  381               null
2022-09-19  81                644
2022-09-20  24                486
2022-09-21  91                196
2022-09-22  110               225
2022-09-23  231               432
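An alternative to chaining lag calls is a window frame over the current row and the two rows before it. The sketch below runs that idea in SQLite through Python's sqlite3 module. The table daily_calls, its columns day and calls, and the helper rolling_three_day are assumed names for illustration, not part of the original query. The COUNT(*) check leaves the first two days as null, matching the lag output above.

```python
import sqlite3

# Daily totals from the question, with ISO dates so text ordering is chronological.
DAILY = [
    ("2022-09-17", 182), ("2022-09-18", 381), ("2022-09-19", 81),
    ("2022-09-20", 24), ("2022-09-21", 91), ("2022-09-22", 110),
    ("2022-09-23", 231),
]

QUERY = """
SELECT day,
       calls,
       -- Only emit a sum once the frame really holds three rows.
       CASE
           WHEN COUNT(*) OVER (ORDER BY day
                               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) = 3
           THEN SUM(calls) OVER (ORDER BY day
                                 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
       END AS calls_3d
FROM daily_calls
ORDER BY day
"""


def rolling_three_day(rows):
    """Return (day, calls, calls_3d) tuples; calls_3d is None for the first two days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_calls (day TEXT, calls INTEGER)")
    conn.executemany("INSERT INTO daily_calls VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


rolling = rolling_three_day(DAILY)
for row in rolling:
    print(row)
```

Like the lag version, a ROWS frame counts rows rather than calendar days, so both approaches assume there is exactly one row per day with no gaps.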

Get nearest date column value from another table in SQL Server

I have two tables A and B,
Table A
PstngDate WorkingDayOutput
12/1/2020 221
12/3/2020 327
12/4/2020 509
12/5/2020 418
12/7/2020 390
12/8/2020 431
12/9/2020 244
12/10/2020 246
12/11/2020 314
12/12/2020 301
12/14/2020 411
12/15/2020 530
12/16/2020 554
12/17/2020 300
12/18/2020 375
12/23/2020 402
12/24/2020 302
12/25/2020 269
12/26/2020 382
12/28/2020 608
Table B
PstngDate HolidayOutput isWorkingDay
12/2/2020 20 0
12/6/2020 24 0
12/13/2020 31 0
12/19/2020 82 0
12/22/2020 507 0
12/27/2020 537 0
Expected output:
PstngDate WorkingDayOutput HolidayOutput
12/1/2020 221 20
12/3/2020 327
12/4/2020 509
12/5/2020 418 24
12/7/2020 390
12/8/2020 431
12/9/2020 244
12/10/2020 246
12/11/2020 314
12/12/2020 301 31
12/14/2020 411
12/15/2020 530
12/16/2020 554
12/17/2020 300
12/18/2020 375 589
12/23/2020 402
12/24/2020 302
12/25/2020 269
12/26/2020 382 537
12/28/2020 608
I want to join TableB to TableA on the nearest lesser date. As the expected output shows, the HolidayOutput value on the 12/18 row is the sum of the 12/19 and 12/22 rows of Table B.
I want to join TableB to TableA with nearest lesser date column
This sounds like a lateral join:
select a.*, coalesce(b.holidayquantity, 0) as holidayquantity
from a
outer apply (
select top (1) b.*
from b
where b.pstng_date >= a.pstng_date
order by b.pstng_date
) b
You can use a join with ROW_NUMBER as follows:
Select pstng_date, workingDayQuantity,
       HolidayQuantity,
       workingDayQuantity + HolidayQuantity as total
From
  (Select a.*, b.HolidayQuantity,
          Row_number() over (partition by a.pstng_date order by b.pstng_date) as rn
   From tablea a join tableb b On b.pstng_date > a.pstng_date) t
Where rn=1
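The expected output in the question adds the 19 and 22 December holiday rows onto 18 December, so each holiday row has to be assigned to the latest working day before it and then summed. Here is a minimal sketch of one way to do that, run in SQLite through Python's sqlite3 module. The table names table_a and table_b, the columns pstng_date, working_output and holiday_output, and the helper merge_outputs are assumptions for illustration, and the dates are stored as ISO text so they compare chronologically.

```python
import sqlite3

# Table A from the question: (posting date, working-day output).
WORKING = [
    ("2020-12-01", 221), ("2020-12-03", 327), ("2020-12-04", 509),
    ("2020-12-05", 418), ("2020-12-07", 390), ("2020-12-08", 431),
    ("2020-12-09", 244), ("2020-12-10", 246), ("2020-12-11", 314),
    ("2020-12-12", 301), ("2020-12-14", 411), ("2020-12-15", 530),
    ("2020-12-16", 554), ("2020-12-17", 300), ("2020-12-18", 375),
    ("2020-12-23", 402), ("2020-12-24", 302), ("2020-12-25", 269),
    ("2020-12-26", 382), ("2020-12-28", 608),
]

# Table B from the question: (posting date, holiday output).
HOLIDAY = [
    ("2020-12-02", 20), ("2020-12-06", 24), ("2020-12-13", 31),
    ("2020-12-19", 82), ("2020-12-22", 507), ("2020-12-27", 537),
]

QUERY = """
WITH assigned AS (
    -- For each holiday row, find the latest working day before it.
    SELECT (SELECT MAX(a.pstng_date)
            FROM table_a AS a
            WHERE a.pstng_date < b.pstng_date) AS target_date,
           b.holiday_output
    FROM table_b AS b
),
summed AS (
    -- Several holidays can land on the same working day, so add them up.
    SELECT target_date, SUM(holiday_output) AS holiday_output
    FROM assigned
    GROUP BY target_date
)
SELECT a.pstng_date, a.working_output, s.holiday_output
FROM table_a AS a
LEFT JOIN summed AS s ON s.target_date = a.pstng_date
ORDER BY a.pstng_date
"""


def merge_outputs(working, holiday):
    """Return one row per working day; holiday_output is None when nothing maps to it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (pstng_date TEXT, working_output INTEGER)")
    conn.execute("CREATE TABLE table_b (pstng_date TEXT, holiday_output INTEGER)")
    conn.executemany("INSERT INTO table_a VALUES (?, ?)", working)
    conn.executemany("INSERT INTO table_b VALUES (?, ?)", holiday)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


merged = merge_outputs(WORKING, HOLIDAY)
for row in merged:
    print(row)
```

With the sample data, 18 December receives 589 (82 + 507) and every working day without a following holiday keeps an empty HolidayOutput. A SQL Server version would follow the same shape, but this sketch has only been reasoned through for SQLite.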

SQL Server Obtain Pairs of records

I am trying to obtain "pairs" of records and I just can't figure it out.
Here is what I have:
Id TruckId LocationId MaterialMode
145223 1198 19 43
145224 1199 19 43
145225 1200 19 43
145226 1198 20 43
145227 1199 20 43
145228 1200 20 43
145229 1199 21 46
145230 1198 21 46
145231 1200 21 46
145232 1198 22 46
145233 1199 22 46
145234 1200 22 46
145235 1198 19 43
145236 1199 19 43
145237 1200 19 43
145238 1198 20 43
145239 1199 20 43
145240 1200 20 43
145241 1199 21 46
145242 1198 21 46
145243 1200 21 46
145244 1198 22 46
145245 1199 22 46
145246 1200 22 46
I need to get the following:
Id A Id B
145223 145226
145224 145227
145225 145228
145229 145233
145230 145232
145231 145234
145235 145238
145236 145239
145237 145240
145241 145245
145242 145244
145243 145246
Basically matching a TruckId between 2 locations under the same material mode
I have tried:
SELECT
Id AS IdA,
Lead(Id, 1, NULL) OVER(PARTITION BY TruckId, MaterialMode ORDER BY Date) AS IdB
FROM T
This produces:
Id A Id B
145223 145226
145224 145227
145225 145228
*145226 145235
*145227 145236
*145228 145237
145229 145233
145230 145232
145231 145234
*145232 145242
*145233 145241
*145234 145243
145235 145238
145236 145239
145237 145240
145241 145245
145242 145244
145243 145246
I don't want the records marked with *. Once a record is part of a matched pair, it should not be part of another match.
I believe I understand your problem and below is a solution.
Explanation: I split the rows into start-point and end-point sets, as in gaps-and-islands problems, and then joined each start id to the end id with the same material mode and truck.
; with separationSet as
(
    select
        *,
        dense_rank()
            over(
                partition by materialmode,truckid
                order by locationid asc
            ) as r
    from T
)
, scoredSet as
(
    select
        *,
        row_number()
            over(
                partition by materialmode,truckid,r
                order by id
            ) as r2
    from separationSet
)
, startToEndPairs as
(
    select
        S.id as StartId,
        E.id as EndId
    from scoredSet S
    join scoredSet E
        on S.r=1 and E.r=2
        and S.r2=E.r2
        and S.TruckId=E.TruckId
        and S.materialmode=E.materialmode
)
select
    *
from startToEndPairs
order by StartId asc

How to give a different row id to each sub group within a group in a SQL query?

I have a bunch of discount schemes for my item table, and each item has different discount schemes. I want to assign a row id to each item row, starting from zero (0) for each item group and changing whenever the DiscountId changes. My table is shown in the image below.
For example, ItemCode 429 has 7 rows with the same DiscountId 427, so I want row id 0 (zero) for all of them. When the DiscountId changes, i.e. the same ItemCode with DiscountId 428, I want the next row id, incremented by one. When the ItemCode changes, the row id should start from zero (0) again.
Can anyone help me, please?
My current query is simply "select * from ItemDiscount_md".
Maybe something like this:
Test data:
DECLARE @tbl TABLE(ITEMCode INT,DiscountId INT)
INSERT INTO @tbl
VALUES
    (73,419),(73,419),(73,420),(73,420),(73,420),
    (429,427),(429,427),(429,427),(429,427),(429,427),
    (429,427),(429,427),(429,427),(429,428),(429,428)
Query:
;WITH CTE
AS
(
    SELECT
        DENSE_RANK() OVER(PARTITION BY tbl.ITEMCode
                          ORDER BY DiscountId) AS Rownbr,
        tbl.*
    FROM
        @tbl AS tbl
)
SELECT
    CTE.Rownbr-1 AS RowNbr,
    CTE.DiscountId,
    CTE.ITEMCode
FROM
    CTE
Of course you can simplify the query by writing this:
SELECT
    (DENSE_RANK() OVER(PARTITION BY tbl.ITEMCode
                       ORDER BY DiscountId))-1 AS Rownbr,
    tbl.*
FROM
    @tbl AS tbl
I just thought it was nicer and more readable with a CTE.
References:
DENSE_RANK
OVER Clause
Using Common Table Expressions
ROW_NUMBER
EDIT
To answer the comment: no, ROW_NUMBER will not return the same counter. This is the output with DENSE_RANK:
0 419 73
0 419 73
1 420 73
1 420 73
1 420 73
0 427 429
0 427 429
0 427 429
0 427 429
0 427 429
0 427 429
0 427 429
0 427 429
1 428 429
1 428 429
And this is with ROW_NUMBER:
0 419 73
1 419 73
2 420 73
3 420 73
4 420 73
0 427 429
1 427 429
2 427 429
3 427 429
4 427 429
5 427 429
6 427 429
7 427 429
8 428 429
9 428 429
As you can see, ROW_NUMBER() numbers every row within the ItemCode group, whereas DENSE_RANK() gives rows with the same DiscountId the same rank.
Just a more simplified version of Arion's answer:
DECLARE @tbl TABLE(ITEMCode INT,DiscountId INT)
INSERT INTO @tbl
VALUES
    (73,419),
    (73,419),
    (73,420),
    (73,420),
    (73,420),
    (429,427),
    (429,427),
    (429,427),
    (429,427),
    (429,427),
    (429,427),
    (429,427),
    (429,427),
    (429,428),
    (429,428)
;
SELECT
    (DENSE_RANK() OVER(PARTITION BY ITEMCode ORDER BY DiscountId) -1) AS Rownbr,
    DiscountId,
    ITEMCode
FROM
    @tbl
If you have data like this (from the image) in a #temp table:
itemcode DiscountId DayId
----------- ----------- -----------
102 416 2
102 416 3
102 416 4
79 419 3
79 419 1
79 420 2
79 420 1
use ROW_NUMBER() to get the result below:
itemcode DiscountId DayId rowid
----------- ----------- ----------- --------------------
102 416 2 1
102 416 3 2
102 416 4 3
79 419 3 1
79 419 1 2
79 420 2 1
79 420 1 2
SQL example:
select itemcode, DiscountId, DayId
, ROW_NUMBER() over (partition by Discountid order by discountid) as 'rowid'
from #temp

HSQLDB query to replace a null value with a value derived from another record

This is a small excerpt from a much larger table, call it LOG:
RN EID FID FRID TID TFAID
1 364 509 7045 null 7452
2 364 509 7045 7452 null
3 364 509 7045 7457 null
4 375 512 4525 5442 5241
5 375 513 4525 5863 5241
6 375 515 4525 2542 5241
7 576 621 5632 null 5452
8 576 621 5632 2595 null
9 672 622 5632 null 5966
10 672 622 5632 2635 null
I would like a query that replaces each null in the 'TFAID' column with the non-null 'TFAID' value from another row that has the same 'FID'.
Desired output would therefore be:
RN EID FID FRID TID TFAID
1 364 509 7045 null 7452
2 364 509 7045 7452 7452
3 364 509 7045 7457 7452
4 375 512 4525 5442 5241
5 375 513 4525 5863 5241
6 375 515 4525 2542 5241
7 576 621 5632 null 5452
8 576 621 5632 2595 5452
9 672 622 5632 null 5966
10 672 622 5632 2635 5966
I know that something like
SELECT RN,
EID,
FID,
FRID,
TID,
COALESCE(TFAID, {insert clever code here}) AS TFAID
FROM LOG
is what I need, but I can't for the life of me come up with the clever bit of SQL that will fill in the proper TFAID.
HSQLDB supports SQL features that can be used as alternatives. These features are not supported by some other databases.
CREATE TABLE LOG (RN INT, EID INT, FID INT, FRID INT, TID INT, TFAID INT);
-- using LATERAL
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, f.TFAID) AS TFAID
FROM LOG l , LATERAL (SELECT MAX(TFAID) AS TFAID FROM LOG f WHERE f.FID = l.FID) f
-- using scalar subquery
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, (SELECT MAX(TFAID) AS TFAID FROM LOG f WHERE f.FID = l.FID)) AS TFAID
FROM LOG l
Here is one approach. This aggregates the log to get the value and then joins the result in:
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, f.TFAID) AS TFAID
FROM LOG l join
(select fid, max(tfaid) as tfaid
from log
group by fid
) f
on l.fid = f.fid;
There may be other approaches that are more efficient. However, HSQL doesn't implement all SQL features.