count :SQL :DB2 - sql

I have this table:
CODE  IDNR  NAME  LIMIT
123   80    XXX   2019-05
123   81    XXX   2019-10
124   80    YYY   2019-01
125   80    ZZZ   2019-05
125   81    ZZZ   2019-06
125   80    ZZZ   2019-07
126   80    III   2019-05
126   80    III   2019-09
126   80    III   2019-07
I want to add a new column (Count-Limit) containing the number of LIMIT values per CODE, and another column (CON) containing YES if the limits are continuous and NO if not.
The result I want looks like this:
CODE  IDNR  NAME  LIMIT    Count-Limit  CON
123   80    XXX   2019-05  2            NO
123   81    XXX   2019-10  2            NO
124   80    YYY   2019-01  1            NO
125   80    ZZZ   2019-05  3            YES
125   81    ZZZ   2019-06  3            YES
125   80    ZZZ   2019-07  3            YES
126   80    III   2019-05  3            NO
126   80    III   2019-09  3            NO
126   80    III   2019-07  3            NO
THANKS!

Try this:
WITH T (CODE, IDNR, NAME, LIMIT) AS
(
VALUES
(123, 80, 'XXX', '2019-05')
, (123, 81, 'XXX', '2019-10')
, (124, 80, 'YYY', '2019-01')
, (125, 80, 'ZZZ', '2019-05')
, (125, 81, 'ZZZ', '2019-06')
, (125, 80, 'ZZZ', '2019-07')
, (126, 80, 'III', '2019-05')
, (126, 80, 'III', '2019-09')
, (126, 80, 'III', '2019-07')
, (128, 80, 'AAA', '2021-01')
, (128, 80, 'AAA', '2021-03')
, (128, 80, 'AAA', '2021-05')
, (128, 80, 'AAA', '2021-07')
, (128, 80, 'AAA', '2021-08')
, (128, 80, 'AAA', '2021-09')
)
SELECT
T.*
, COUNT (1) OVER (PARTITION BY CODE) AS COUNT_LIMIT
, CASE
WHEN TO_DATE (LIMIT || '-01', 'YYYY-MM-DD') IN
(
LAG (TO_DATE (LIMIT || '-01', 'YYYY-MM-DD')) OVER (PARTITION BY CODE ORDER BY LIMIT) + 1 MONTH
, LEAD (TO_DATE (LIMIT || '-01', 'YYYY-MM-DD')) OVER (PARTITION BY CODE ORDER BY LIMIT) - 1 MONTH
)
THEN 'YES'
ELSE 'NO'
END AS CON
FROM T
ORDER BY CODE, LIMIT
The result is:
CODE  IDNR  NAME  LIMIT    COUNT_LIMIT  CON
123   80    XXX   2019-05  2            NO
123   81    XXX   2019-10  2            NO
124   80    YYY   2019-01  1            NO
125   80    ZZZ   2019-05  3            YES
125   81    ZZZ   2019-06  3            YES
125   80    ZZZ   2019-07  3            YES
126   80    III   2019-05  3            NO
126   80    III   2019-07  3            NO
126   80    III   2019-09  3            NO
128   80    AAA   2021-01  6            NO
128   80    AAA   2021-03  6            NO
128   80    AAA   2021-05  6            NO
128   80    AAA   2021-07  6            YES
128   80    AAA   2021-08  6            YES
128   80    AAA   2021-09  6            YES
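The desired output in the question marks a whole CODE as YES only when all of its LIMIT months are consecutive, whereas the query above flags each row by its immediate neighbours. As a minimal sketch of the per-code rule outside the database, assuming the rows are available as Python tuples (the helper names `month_index` and `annotate` are hypothetical, not part of the answer):

```python
from collections import defaultdict


def month_index(limit):
    """Turn 'YYYY-MM' into a running month number; consecutive months differ by 1."""
    year, month = limit.split("-")
    return int(year) * 12 + int(month)


def annotate(rows):
    """rows: (code, idnr, name, limit) tuples. Returns each row plus (count, con)."""
    months = defaultdict(list)
    for code, _idnr, _name, limit in rows:
        months[code].append(month_index(limit))

    flags = {}
    for code, values in months.items():
        values.sort()
        # Assumption: a code is continuous only if it has more than one row
        # and every step between its sorted months is exactly one month.
        continuous = len(values) > 1 and all(
            b - a == 1 for a, b in zip(values, values[1:])
        )
        flags[code] = (len(values), "YES" if continuous else "NO")

    return [row + flags[row[0]] for row in rows]


rows = [
    (123, 80, "XXX", "2019-05"),
    (123, 81, "XXX", "2019-10"),
    (124, 80, "YYY", "2019-01"),
    (125, 80, "ZZZ", "2019-05"),
    (125, 81, "ZZZ", "2019-06"),
    (125, 80, "ZZZ", "2019-07"),
    (126, 80, "III", "2019-05"),
    (126, 80, "III", "2019-09"),
    (126, 80, "III", "2019-07"),
]
result = annotate(rows)
```

For the nine rows from the question this marks code 125 as YES and codes 123, 124 and 126 as NO.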

Related

SQL to find related rows in Loop in ANSI SQL or Snowflake SQL

I have a requirement to link all related customer IDs and assign one Unified Cust ID to every related CUST_ID.
Example, for the data below:
INPUT DATA
PK_ID CUST_ID_1 CUST_ID_2 CUST_ID_3
1 123 456 567
2 898 567 780
3 999 780 111
4 111 222 333
Rows that share a value in CUST_ID_1, CUST_ID_2 or CUST_ID_3 need to be linked, and all linked rows should be assigned the same Unified ID.
OUTPUT DATA
Unified ID CUST_ID_1 CUST_ID_2 CUST_ID_3
1000 123 456 567
1000 898 567 780
1000 999 780 111
1000 111 222 333
I am trying to do this with a self join, but it does not give a definite result. Is there a function or ANSI SQL feature that can help with this?
What I have tried:
CREATE TEMP TBL_TEMP AS(
SELECT A.PK_ID
FROM TBL A
LEFT JOIN TBL B
ON A.CUST_ID_1=B.CUST_ID_1
AND A.PK_ID<>B.PK_ID)
UPDATE TBL
FROM TBL_TEMP
SET UNIFIED_ID=SEQ_UNIF_ID.nextval
WHERE TBL.PK_ID=TBL_TEMP.PK_ID
I would have to write this update for each column and run it multiple times.
If you are OK with gaps in the sequence, the following is what I can come up with for now.
update cust_temp a
set unified_id = t.unified_id
from
(
select
case
when (select count(*) from cust_temp b
where arrays_overlap(array_construct(a.cust_id_1,a.cust_id_2,a.cust_id_3),
array_construct(b.cust_id_1,b.cust_id_2,b.cust_id_3)))>1 -- match across data-set
then 1000 -- same value for common rows
else
ts.nextval --- using sequence for non-common rows
end unified_id,
a.cust_id_1,a.cust_id_2,a.cust_id_3
from cust_temp a, table(getnextval(SEQ_UNIF_ID)) ts) t
where t.cust_id_1 = a.cust_id_1
and t.cust_id_2 = a.cust_id_2
and t.cust_id_3 = a.cust_id_3;
Updated data-set
select * from cust_temp;
UNIFIED_ID  CUST_ID_1  CUST_ID_2  CUST_ID_3
1000        123        456        567
1000        898        567        780
1000        111        222        333
20000       100        200        300
1000        999        780        111
1000        234        123        901
23000       260        360        460
24000       160        560        760
Original data set -
select * from cust_temp;
UNIFIED_ID  CUST_ID_1  CUST_ID_2  CUST_ID_3
NULL        123        456        567
NULL        898        567        780
NULL        111        222        333
NULL        100        200        300
NULL        999        780        111
NULL        234        123        901
NULL        260        360        460
NULL        160        560        760
The arrays_overlap logic is thanks to @Simeon.
The following procedure can be used:
EXECUTE IMMEDIATE $$
DECLARE
duplicate number;
x number;
BEGIN
duplicate := (select count(cnt) from (select a.unified_id,count(*) cnt from cust_temp a,
cust_temp b
where
arrays_overlap(array_construct(a.cust_id_1,a.cust_id_2,a.cust_id_3),
array_construct(b.cust_id_1,b.cust_id_2,b.cust_id_3))
AND a.cust_id_1 != b.cust_id_1
AND a.cust_id_2 != b.cust_id_2
AND a.cust_id_3 != b.cust_id_3
group by a.unified_id) where cnt>1
);
for x in 1 to duplicate do
update cust_temp a
set a.unified_id = (select min(b.unified_id) uid from cust_temp b
where arrays_overlap(array_construct(a.cust_id_1,a.cust_id_2,a.cust_id_3),
array_construct(b.cust_id_1,b.cust_id_2,b.cust_id_3)));
end for;
END;
$$
;
It produces the following output data set:
UNIFIED_ID  CUST_ID_1  CUST_ID_2  CUST_ID_3
1000        100        200        300
2000        123        456        567
2000        898        567        780
2000        111        222        333
2000        999        780        111
2000        234        123        901
7000        260        360        460
8000        160        560        760
8000        186        160        766
for this input data set:
UNIFIED_ID  CUST_ID_1  CUST_ID_2  CUST_ID_3
1000        100        200        300
2000        123        456        567
3000        898        567        780
4000        111        222        333
5000        999        780        111
6000        234        123        901
7000        260        360        460
8000        160        560        760
9000        186        160        766
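Linking rows that share any customer ID is a connected-components problem, so it can also be worked out outside SQL. The following is a minimal union-find sketch in Python, not the procedure above: it assumes the rows fit in memory and contain no NULL IDs, the function name `unify` and its `first_id`/`step` parameters are hypothetical, and it hands out fresh IDs per group instead of reusing the smallest existing UNIFIED_ID.

```python
def unify(rows, first_id=1000, step=1000):
    """rows: tuples of customer IDs. Returns (unified_id, *ids) for each row."""
    parent = {}

    def find(x):
        # Follow parent links to the representative, shortening the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # All IDs that appear in the same row belong to the same group.
    for ids in rows:
        for other in ids[1:]:
            union(ids[0], other)

    # Number the groups in order of first appearance (hypothetical numbering).
    group_ids = {}
    result = []
    for ids in rows:
        root = find(ids[0])
        if root not in group_ids:
            group_ids[root] = first_id + step * len(group_ids)
        result.append((group_ids[root],) + tuple(ids))
    return result


rows = [
    (100, 200, 300),
    (123, 456, 567),
    (898, 567, 780),
    (111, 222, 333),
    (999, 780, 111),
    (234, 123, 901),
    (260, 360, 460),
    (160, 560, 760),
    (186, 160, 766),
]
unified_rows = unify(rows)
```

On the nine input rows above this puts the same rows together as the procedure's output (one group of five rows, one group of two, and two single rows); only the ID values differ, because the sketch numbers groups from scratch.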

Groupby each group and then do division of two divisions (current yr/latest yr for each column)

For each group, I'd like to divide each year's value by the latest year's value, separately for Col_1 and Col_2, and then divide the first ratio by the second to create a new column.
Methodology: Calculate (EachYrCol_1/Yr2000Col_1)/(EachYrCol_2/Yr2000Col_2) for each group
See example below:
Year  Group  Col_1  Col_2  New Column
1995  A      100    11     (100/600)/(11/66)
1996  A      200    22     (200/600)/(22/66)
1997  A      300    33     (300/600)/(33/66)
1998  A      400    44     .............
1999  A      500    55     .............
2000  A      600    66     .............
1995  B      700    77     (700/1200)/(77/399)
1996  B      800    88     (800/1200)/(88/399)
1997  B      900    99     (900/1200)/(99/399)
1998  B      1000   199    .............
1999  B      1100   299    .............
2000  B      1200   399    .............
Sample dataset:
import pandas as pd
df = pd.DataFrame({'Year':[1995, 1996, 1997, 1998, 1999, 2000,1995, 1996, 1997, 1998, 1999, 2000],
'Group':['A', 'A', 'A','A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
'Col_1':[100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200],
'Col_2':[11, 22, 33, 44, 55, 66, 77, 88, 99, 199, 299, 399]})
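Use GroupBy.transform with GroupBy.last to build a helper DataFrame that holds each group's last row, so each column can be divided by it. This relies on the rows being ordered by Year within each group, as in the sample: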
df1 = df.groupby('Group').transform('last')
df['New'] = df['Col_1'].div(df1['Col_1']).div(df['Col_2'].div(df1['Col_2']))
print (df)
Year Group Col_1 Col_2 New
0 1995 A 100 11 1.000000
1 1996 A 200 22 1.000000
2 1997 A 300 33 1.000000
3 1998 A 400 44 1.000000
4 1999 A 500 55 1.000000
5 2000 A 600 66 1.000000
6 1995 B 700 77 3.022727
7 1996 B 800 88 3.022727
8 1997 B 900 99 3.022727
9 1998 B 1000 199 1.670854
10 1999 B 1100 299 1.223244
11 2000 B 1200 399 1.000000

SQL Join by comparing measures or loop with cursors?

In order to verify if Deliveries are done on time, I need to match delivery Documents to PO schedule lines (SchLin) based on the comparison between Required Quantity (ReqQty) and Delivered Quantity (DlvQty).
The Delivery Docs have a reference to the PO and POItm but not to the SchLin.
Once a Delivery Doc is assigned to a Schedule Line I can calculate the Delivery Delta (DlvDelta) as the number of days it was delivered early or late compared to the requirement (ReqDate).
Examples of the two base tables are as follows:
Schedule lines
PO POItm SchLin ReqDate ReqQty
123 1 1 10/11 20
123 1 2 30/11 30
124 2 1 15/12 10
124 2 2 24/12 15
Delivery Docs
Doc Item PO POItm DlvDate DlvQty
810 1 123 1 29/10 12
816 1 123 1 02/11 07
823 1 123 1 04/11 13
828 1 123 1 06/11 08
856 1 123 1 10/11 05
873 1 123 1 14/11 09
902 1 124 2 27/11 05
908 1 124 2 30/11 07
911 1 124 2 08/12 08
923 1 124 2 27/12 09
Important: Schedule Lines and Deliveries should have the same PO and POItm.
The other logic to link is to sum the DlvQty until we reach (or exceed) ReqQty.
Those deliveries are then linked to the schedule line. Subsequent deliveries are used for the following schedule line(s). A delivery should be matched to only one schedule line.
After comparing the ReqQty and DlvQty the assignments should result in following:
Result
Doc Item PO POItm Schlin ReqDate DlvDate DlvDelta
810 1 123 1 1 10/11 29/10 -11
816 1 123 1 1 10/11 02/11 -08
823 1 123 1 1 10/11 04/11 -06
828 1 123 1 2 30/11 06/11 -24
856 1 123 1 2 30/11 10/11 -20
873 1 123 1 2 30/11 14/11 -16
902 1 124 2 1 15/12 27/11 -18
908 1 124 2 1 15/12 30/11 -15
911 1 124 2 2 24/12 08/12 -16
923 1 124 2 2 24/12 27/12 +03
Up till now, I have done this with loops using cursors but performance is rather sluggish.
Is there another way in SQL (script) using e.g. joins by comparing measures to achieve the same result?
Regards,
Eric
If you can express the rule for matching a delivery with a schedule line, you can produce the results you want in a single query. And, yes, I promise it will be faster (and simpler) than executing the same logic in loops on cursors.
I can't reproduce your exact results because I don't quite understand how the two tables relate. Hopefully from the code below you'll be able to figure it out by adjusting the join criteria.
I don't have your DBMS. My code uses SQLite, which has its own peculiar date functions. You'll have to substitute the ones your system provides. In any event, I can't recommend 5-character strings for dates. Use a datetime type if you have one, and include 4-digit years regardless. Otherwise, how many days are there between Christmas and New Year's Day?
create table S (
PO int not NULL,
POItm int not NULL,
SchLin int not NULL,
ReqDate char not NULL,
ReqQty int not NULL,
primary key (PO, POItm, SchLin)
);
insert into S values
(123, 1, 1, '10/11', 20 ),
(123, 1, 2, '30/11', 30 ),
(124, 2, 1, '15/12', 10 ),
(124, 2, 2, '24/12', 15 );
create table D (
Doc int not NULL,
Item int not NULL,
PO int not NULL,
POItm int not NULL,
DlvDate char not NULL,
DlvQty int not NULL,
primary key (Doc, Item)
);
insert into D values
(810, 1, 123, 1, '29/10', 12 ),
(816, 1, 123, 1, '02/11', 07 ),
(823, 1, 123, 1, '04/11', 13 ),
(828, 1, 123, 1, '06/11', 08 ),
(856, 1, 123, 1, '10/11', 05 ),
(873, 1, 123, 1, '14/11', 09 ),
(902, 1, 124, 2, '27/11', 05 ),
(908, 1, 124, 2, '30/11', 07 ),
(911, 1, 124, 2, '08/12', 08 ),
(923, 1, 124, 2, '27/12', 09 );
select D.Doc, D.Item, D.PO, S.SchLin, S.ReqDate, D.DlvDate
, cast(
julianday('2018-' || substr(DlvDate, 4,2) || '-' || substr(DlvDate, 1,2))
- julianday('2018-' || substr(ReqDate, 4,2) || '-' || substr(ReqDate, 1,2))
as int) as DlvDelta
from S join D on S.PO = D.PO and S.POItm = D.POItm
;
Result:
Doc Item PO SchLin ReqDate DlvDate DlvDelta
---------- ---------- ---------- ---------- ---------- ---------- ----------
810 1 123 1 10/11 29/10 -12
810 1 123 2 30/11 29/10 -32
816 1 123 1 10/11 02/11 -8
816 1 123 2 30/11 02/11 -28
823 1 123 1 10/11 04/11 -6
823 1 123 2 30/11 04/11 -26
828 1 123 1 10/11 06/11 -4
828 1 123 2 30/11 06/11 -24
856 1 123 1 10/11 10/11 0
856 1 123 2 30/11 10/11 -20
873 1 123 1 10/11 14/11 4
873 1 123 2 30/11 14/11 -16
902 1 124 1 15/12 27/11 -18
902 1 124 2 24/12 27/11 -27
908 1 124 1 15/12 30/11 -15
908 1 124 2 24/12 30/11 -24
911 1 124 1 15/12 08/12 -7
911 1 124 2 24/12 08/12 -16
923 1 124 1 15/12 27/12 12
923 1 124 2 24/12 27/12 3
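The join above pairs every delivery with every schedule line of the same PO and POItm. The matching rule from the question (accumulate DlvQty until ReqQty is reached, then move on to the next schedule line) can be sketched procedurally as follows. This is only an illustration in Python under stated assumptions, not a replacement SQL query: the year 2018 is assumed as in the SQLite query, the running quantity is reset for each new schedule line, deliveries beyond the last line stay on the last line, every delivery is assumed to have at least one schedule line, and the names `parse` and `match` are hypothetical.

```python
from datetime import date

ASSUMED_YEAR = 2018  # the sample dates carry no year; 2018 mirrors the SQLite query


def parse(ddmm):
    """Parse a 'DD/MM' string into a date in the assumed year."""
    day, month = ddmm.split("/")
    return date(ASSUMED_YEAR, int(month), int(day))


def match(schedule, deliveries):
    """schedule: (PO, POItm, SchLin, ReqDate, ReqQty)
    deliveries: (Doc, Item, PO, POItm, DlvDate, DlvQty)
    Returns (Doc, Item, PO, POItm, SchLin, ReqDate, DlvDate, DlvDelta) rows."""
    lines_by_item = {}
    for po, itm, schlin, req_date, req_qty in sorted(schedule):
        lines_by_item.setdefault((po, itm), []).append((schlin, req_date, req_qty))

    state = {}  # (PO, POItm) -> [index of current schedule line, quantity so far]
    result = []
    ordered = sorted(deliveries, key=lambda d: (d[2], d[3], parse(d[4]), d[0]))
    for doc, item, po, itm, dlv_date, dlv_qty in ordered:
        lines = lines_by_item[(po, itm)]
        pos = state.setdefault((po, itm), [0, 0])
        schlin, req_date, req_qty = lines[pos[0]]
        delta = (parse(dlv_date) - parse(req_date)).days
        result.append((doc, item, po, itm, schlin, req_date, dlv_date, delta))
        pos[1] += dlv_qty
        # Once the line is covered, later deliveries go to the next line, if any.
        if pos[1] >= req_qty and pos[0] < len(lines) - 1:
            pos[0] += 1
            pos[1] = 0
    return result


schedule = [
    (123, 1, 1, "10/11", 20), (123, 1, 2, "30/11", 30),
    (124, 2, 1, "15/12", 10), (124, 2, 2, "24/12", 15),
]
deliveries = [
    (810, 1, 123, 1, "29/10", 12), (816, 1, 123, 1, "02/11", 7),
    (823, 1, 123, 1, "04/11", 13), (828, 1, 123, 1, "06/11", 8),
    (856, 1, 123, 1, "10/11", 5), (873, 1, 123, 1, "14/11", 9),
    (902, 1, 124, 2, "27/11", 5), (908, 1, 124, 2, "30/11", 7),
    (911, 1, 124, 2, "08/12", 8), (923, 1, 124, 2, "27/12", 9),
]
matched = match(schedule, deliveries)
```

On the sample data this reproduces the schedule-line assignments from the question's expected result. The computed delta for Doc 810 is -12 days, the same value the SQLite query returns for that pair, rather than the -11 shown in the question.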

Group rows by condition

I have this data:
Start End Quantity
425 449 24
450 474 24
475 499 24
500 524 24
2300 2324 24
2400 2499 99
2500 2599 99
2800 2899 99
2900 2999 99
3200 3249 49
3250 3299 49
3300 3349 49
3350 3399 49
3400 3449 49
3500 3549 49
3600 3624 24
3650 3674 24
3700 3724 24
3950 3964 14
4000 4000 0
4150 4399 249
4400 4499 99
5034 5075 41
Quantity is a result of End - Start.
I would like to obtain the following data, including the GENERATED rows:
Start End Quantity
425 449 24
450 474 24
475 499 24
500 524 24
425 524 96
2300 2324 24
2300 2324 24
2400 2499 99
2500 2599 99
-----GENERATED----
425 2599 438
------------------
2800 2899 99
2900 2999 99
3200 3249 49
3250 3299 49
3300 3349 49
3350 3399 49
3400 3449 49
3500 3549 49
-----GENERATED-----
2800 3549 492
------------------
3600 3624 24
3650 3674 24
3700 3724 24
3950 3964 14
4000 4000 0
4150 4399 249
4400 4499 99
5034 5075 41
-----GENERATED-----
3600 5075 475
------------------
The condition is to sum the quantities until the total reaches 500; once it passes 500, a new count starts.
I have tried Rollup, but I couldn't find the right condition to make it work.
Of course, this is much easier to do in application code than in SQL, but we must do it in the database environment. Any tool can be used to produce the generated rows: looping functions, new tables, etc.
Error solving
I ran into an error while running @Prdp's query:
Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
I found the solution here:
http://sqlhints.com/tag/the-statement-terminated-the-maximum-recursion-100-has-been-exhausted-before-statement-completion/
Update 1
Using @Prdp's query we got the following:
Start End rn st
(400) 424 1 24
425 449 2 48
450 474 3 72
475 499 4 96
500 524 5 120
2300 2324 6 144
2400 2499 7 243
2500 2599 8 342
2800 (2899) 9 (441)
(2900) 2999 10 99
3200 3249 11 148
3250 3299 12 197
3300 3349 13 246
3350 3399 14 295
3400 3449 15 344
3500 3549 16 393
3600 3624 17 417
3650 3674 18 441
3700 3724 19 465
3950 3964 20 479
4000 (4000) 21 (479)
(4150) 4399 22 249
4400 4499 23 348
5034 (5075) 24 (389)
It's getting closer to what we need. Would it be possible to extract only the data between ( and ) while discarding the other data?
We can use cursors too.
You can use Recursive CTE. I can't think of any better way.
;WITH cte
AS (SELECT *,
Row_number()OVER(ORDER BY start) rn
FROM Yourtable),
rec_cte
AS (SELECT *,
( [End] - Start ) AS st,
1 AS grp
FROM cte
WHERE rn = 1
UNION ALL
SELECT a.*,
CASE
WHEN st + ( a.[End] - a.Start ) >= 500 THEN a.[End] - a.Start
ELSE st + ( a.[End] - a.Start )
END,
CASE
WHEN st + ( a.[End] - a.Start ) >= 500 THEN b.grp + 1
ELSE grp
END
FROM cte a
JOIN rec_cte b
ON a.rn = b.rn + 1)
SELECT Min(Start) as Start,
Max([End]) as [End],
Max(st) as Quantity
FROM rec_cte
GROUP BY grp
OPTION (maxrecursion 0)
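The rule the recursive CTE applies (keep a running total of End - Start in Start order, and open a new group whenever adding the next row would bring the total to 500 or more) can be checked with a short procedural sketch. This is an illustration in Python under the assumption that the rows fit in memory; the function name `group_ranges` and its `limit` parameter are hypothetical.

```python
def group_ranges(rows, limit=500):
    """rows: (start, end) pairs. Returns one (start, end, quantity) tuple per group."""
    groups = []
    start = end = None
    total = 0
    for s, e in sorted(rows):
        qty = e - s
        # Open a new group for the first row, or when this row would reach the limit.
        if start is None or total + qty >= limit:
            if start is not None:
                groups.append((start, end, total))
            start, total = s, 0
        end = e
        total += qty
    if start is not None:
        groups.append((start, end, total))
    return groups


rows = [
    (425, 449), (450, 474), (475, 499), (500, 524), (2300, 2324),
    (2400, 2499), (2500, 2599), (2800, 2899), (2900, 2999),
    (3200, 3249), (3250, 3299), (3300, 3349), (3350, 3399),
    (3400, 3449), (3500, 3549), (3600, 3624), (3650, 3674),
    (3700, 3724), (3950, 3964), (4000, 4000), (4150, 4399),
    (4400, 4499), (5034, 5075),
]
groups = group_ranges(rows)
```

With the 23 rows from the question this yields the groups (425, 2899, 417), (2900, 4000, 479) and (4150, 5075, 389). These follow the CTE's rule, so they differ from the hand-written GENERATED rows in the question.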
Here is a proposed solution in MySQL. A similar strategy should work in SQL Server.
drop table if exists TestData;
create table TestData(Start int, End int, Quantity int);
insert TestData values (425,449,24);
insert TestData values (450,474,24);
insert TestData values (475,499,24);
insert TestData values (500,524,24);
insert TestData values (2300,2324,24);
insert TestData values (2400,2499,99);
insert TestData values (2500,2599,99);
insert TestData values (2800,2899,99);
insert TestData values (2900,2999,99);
insert TestData values (3200,3249,49);
insert TestData values (3250,3299,49);
insert TestData values (3300,3349,49);
insert TestData values (3350,3399,49);
insert TestData values (3400,3449,49);
insert TestData values (3500,3549,49);
insert TestData values (3600,3624,24);
insert TestData values (3650,3674,24);
insert TestData values (3700,3724,24);
insert TestData values (3950,3964,14);
insert TestData values (4000,4000,0);
insert TestData values (4150,4399,249);
insert TestData values (4400,4499,99);
insert TestData values (5034,5075,41);
drop table if exists DataRange;
create table DataRange (StartRange int, EndRange int);
insert DataRange values (425, 2599);
insert DataRange values (2800,3549);
insert DataRange values (3600,5075);
select
DataRange.StartRange,DataRange.EndRange
,sum(TestData.quantity) as Quantity
from TestData
inner join DataRange on
(TestData.start between DataRange.StartRange and DataRange.EndRange )
or
(TestData.End between DataRange.StartRange and DataRange.EndRange )
group by DataRange.StartRange,DataRange.EndRange

How will I join/combine these two tables to get one result

Table 1
Code Name1 type BalanceDue id1 id2 id3 id4 id5 emp
2600 intl-Airfare 1 2.38 120 410 510 603 7060513 null
1100 intl-travel 1 2.66 120 420 540 602 7060513 null
2400 intl-Meals 1 1.50 120 420 520 602 7060513 null
4100 Transpo 2 19.70 110 210 510 601 null
4100 Transpo 2 13.25 110 210 500 601 null
4100 Transpo 2 17.38 110 210 500 600 null
3600 Dom travel 3 25.11 110 210 500 600 55713 null
Table 2
Code Details Total type code1 code2 code3 code4 code5 emp
4100 no#233 Emp1-Parking 11.39 2 110 210 510 601 null null
4100 no#231 Jes-Parking 6.83 2 110 210 510 601 null null
4100 no#232 Jes-TransExp 1.48 2 110 210 510 601 null null
4100 no#234 Emp2-TollFee 0.23 2 110 210 500 601 null null
4100 no#239 Emp2-Parking 1.82 2 110 210 500 601 null null
4100 no#240 Emp3-Parking 2.96 2 110 210 500 601 null null
4100 no#252 Emp5-TollFee 8.24 2 110 210 500 601 null null
4100 no#235 Jay-TollFee 4.90 2 110 210 500 600 null null
4100 no#243 Jay-TransExp 12.48 2 110 210 500 600 null null
I want this as a result:
If type is 1, display all rows from table 1 that have type 1.
The same applies if type is 3.
If type is 2, display all rows from table 2, taking code1 to code5 into account.
Result
Code Details type Total code1 code2 code3 code4 code5 emp
2600 intl-Airfare 1 2.38 120 410 510 603 7060513 null
1100 intl-travel 1 2.66 120 420 540 602 7060513 null
2400 intl-Meals 1 1.50 120 420 520 602 7060513 null
4100 no#233 Emp1-Parking 2 11.39 110 210 510 601 null null
4100 no#231 Jes-Parking 2 6.83 110 210 510 601 null null
4100 no#232 Jes-TransExp 2 1.48 110 210 510 601 null null
4100 no#234 Emp2-TollFee 2 0.23 110 210 500 601 null null
4100 no#239 Emp2-Parking 2 1.82 110 210 500 601 null null
4100 no#240 Emp3-Parking 2 2.96 110 210 500 601 null null
4100 no#252 Emp5-TollFee 2 8.24 110 210 500 601 null null
4100 no#235 Jay-TollFee 2 4.90 110 210 500 600 null null
4100 no#243 Jay-TransExp 2 12.48 110 210 500 600 null null
3600 Dom travel 3 25.11 110 210 500 600 55713 null
Sometimes id5 has a value, and sometimes other columns are null.
Sorry, I hope I explained it clearly :(
Thanks everyone!
Your desired results look like a UNION (the UNION clause lets you merge the results of several queries into a single, final result set), similar to this:
SELECT Code, Name1 as Details, type, BalanceDue as Total,id1 as code1, id2 as code2,
id3 as code3, id4 as code4, id5 as code5, emp
FROM Table1 WHERE type IN (1, 3) -- could also be WHERE type = 1 OR type = 3
UNION
SELECT Code, Details, type, Total, code1, code2, code3, code4, code5, emp
FROM Table2 WHERE type = 2
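To try the query shape quickly, here is a small sketch that runs it against an in-memory SQLite database from Python. The column types and the reduced set of sample rows are assumptions made for the illustration. It uses UNION ALL instead of UNION: UNION removes duplicate rows from the combined result, while UNION ALL keeps every row, which may matter if two expense lines are identical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column types; only a few of the question's rows are loaded.
conn.executescript("""
CREATE TABLE Table1 (Code INT, Name1 TEXT, type INT, BalanceDue REAL,
                     id1 INT, id2 INT, id3 INT, id4 INT, id5 INT, emp TEXT);
CREATE TABLE Table2 (Code INT, Details TEXT, Total REAL, type INT,
                     code1 INT, code2 INT, code3 INT, code4 INT, code5 INT, emp TEXT);
INSERT INTO Table1 VALUES
  (2600, 'intl-Airfare', 1,  2.38, 120, 410, 510, 603, 7060513, NULL),
  (4100, 'Transpo',      2, 19.70, 110, 210, 510, 601, NULL,    NULL),
  (3600, 'Dom travel',   3, 25.11, 110, 210, 500, 600, 55713,   NULL);
INSERT INTO Table2 VALUES
  (4100, 'no#233 Emp1-Parking', 11.39, 2, 110, 210, 510, 601, NULL, NULL),
  (4100, 'no#231 Jes-Parking',   6.83, 2, 110, 210, 510, 601, NULL, NULL);
""")

# Types 1 and 3 come from Table1, type 2 comes from Table2.
rows = conn.execute("""
SELECT Code, Name1 AS Details, type, BalanceDue AS Total,
       id1 AS code1, id2 AS code2, id3 AS code3, id4 AS code4, id5 AS code5, emp
FROM Table1 WHERE type IN (1, 3)
UNION ALL
SELECT Code, Details, type, Total, code1, code2, code3, code4, code5, emp
FROM Table2 WHERE type = 2
ORDER BY type, Code
""").fetchall()

for row in rows:
    print(row)
```

The type 2 summary row from Table1 ('Transpo') is left out, and its detail rows from Table2 appear in its place.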