oracle sql running total range - sql

I have two tables. The first, tab_a:
SUB_ID AMOUNT
1 10
2 5
3 7
4 15
5 4
The second, tab_b:
slab_number slab_start slab_end
1 12 20
2 21 25
3 26 35
slab_start will always be 1 more than the slab_end of the previous slab number.
If I compute the running total for tab_a, my result is:
select sub_id , sum(amount) OVER(ORDER BY sub_id) run_sum
from tab_a
sub_id run_sum
1 10
2 15
3 22
4 37
5 41
I need a SQL query that finds the slab_number for each run_sum. If run_sum is less than the first slab_start, slab_number should be zero; if run_sum is more than the last slab_end, it should be blank (NULL), except for the first row that crosses the limit, which gets the last slab_number.
Expected result is
sub_id run_sum slab_number
1 10 0
2 15 1
3 22 2
4 37 3
5 41 NULL
I have tried the following.
First, find the running sum that crosses the limit, i.e. the last slab_end:
select min( run_sum )
from (select sub_id , sum(amount) OVER(ORDER BY sub_id) run_sum
from tab_a ) where run_sum>=35
Then use the result (37) in the query below:
select sub_id,
run_sum,
case
when run_sum <
(select SLAB_START from tab_b where slab_number = '1') then
0
when run_sum = 37 then
(select max(slab_number) from tab_b)
when run_sum > 37 then
NULL
else
(select slab_number
from tab_b
where run_sum between SLAB_START and slab_end)
end slab_number
from (select sub_id, sum(amount) OVER(ORDER BY sub_id) run_sum from tab_a)
Is there a better way to do this?

Somewhat strange requirement :) Use some analytic functions and CASE WHEN expressions: row_number() when you need to find the first of something, max() over () and sum() over () when you need information from other rows:
with
a as (
select sub_id, row_number() over (order by sub_id) rn,
sum(amount) over (order by sub_id) rs
from tab_a),
b as (select tab_b.*, max(slab_number) over () msn from tab_b )
select sub_id, rs,
case when sn is null and row_number() over (partition by sn order by sub_id) = 1
then msn else sn
end sn
from (
select sub_id, rs, max(msn) over () msn,
case when slab_number is null and rn = 1 then 0 else slab_number end sn
from a left join b on rs between slab_start and slab_end)
dbfiddle demo

You could try this:
select a.sub_id , sum(a.amount) OVER(ORDER BY a.sub_id) run_sum
,case when b.slab_number=1 then 0 else lag(b.slab_number,1) over (order by a.sub_id) end slab_number
from tab_a a
left join tab_b b on a.SUB_ID = b.slab_number

I think this is basically a left join with a default value:
select a.*,
(case when a.run_sum < bb.min_slab_start then 0
else b.slab_number
end) as slab_number
from (select sub_id,
sum(amount) over (order by sub_id) as run_sum
from tab_a
) a left join
tab_b b
on a.run_sum between slab_start and slab_end cross join
(select min(slab_start) as min_slab_start
from tab_b
) bb;
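To check the slab rule from the question outside the database, here is a minimal Python sketch. It is not any of the posted queries; the function name `assign_slabs` is hypothetical, and it assumes the amounts are positive (so the running total never decreases) and that the slabs are contiguous and ordered by slab_start.

```python
from itertools import accumulate


def assign_slabs(amounts, slabs):
    """amounts: amounts ordered by sub_id (assumed positive).
    slabs: (slab_number, slab_start, slab_end) tuples, contiguous and ordered.
    Returns a list of (run_sum, slab_number) pairs."""
    first_start = slabs[0][1]
    last_number, _, last_end = slabs[-1]
    crossed = False  # True once a row has gone past the last slab_end
    result = []
    for run_sum in accumulate(amounts):
        if run_sum < first_start:
            slab = 0  # below the first slab
        elif run_sum > last_end:
            # Only the first row past the limit keeps the last slab number.
            slab = None if crossed else last_number
            crossed = True
        else:
            slab = next(n for n, start, end in slabs if start <= run_sum <= end)
        result.append((run_sum, slab))
    return result


amounts = [10, 5, 7, 15, 4]
slabs = [(1, 12, 20), (2, 21, 25), (3, 26, 35)]
print(assign_slabs(amounts, slabs))
# [(10, 0), (15, 1), (22, 2), (37, 3), (41, None)]
```

The output matches the expected result in the question, which makes it a handy reference when comparing the SQL answers.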

Related

return row where column value changed from last change

I have a table and I want to know the minimum date since the last change, grouped by 2 columns.
In the data, I want the latest PartNumberID by location, with the minimum date since the last change.
*The ExpectedRow column is not part of the table.
DATA:
Location RecordAddedDate PartNumberID ExpectedRow
7 2022-06-23 1 I want this row
8 2022-06-23 1 I want this row
8 2022-06-24 1
8 2022-06-25 1
9 2022-06-23 1 I want this row
15 2022-06-23 1
15 2022-06-24 1
15 2022-06-25 2
15 2022-06-26 1 I want this row
15 2022-06-27 1
Expected output:
Location RecordAddedDate PartNumberID
7 2022-06-23 1
8 2022-06-23 1
9 2022-06-23 1
15 2022-06-26 1
I'm on sql
I have tried the following, but I don't know how to stop when the value changes:
with cte as (
select t.LocationID, t.RecordAddedDate, t.PartNumberID
FROM mytable t
INNER JOIN (select PL.LocationID, PL.RecordAddedDate, PL.PartNumberID
FROM mytable PL INNER JOIN
(SELECT PSCc.LocationID, MAX(PSCc.RecordAddedDate) AS DateSetup
FROM mytable PSCc
WHERE PSCc.RecordDeleted = 0
GROUP BY PSCc.LocationID) AS PSCc ON PSCc.LocationID = PL.LocationID AND PSCc.DateSetup = RecordAddedDate) as tt on t.RecordAddedDate<=tt.RecordAddedDate and t.LocationID= tt.LocationID and t.PartNumberID= tt.PartNumberID
)
select *
from cte c
where not exists(
select 1 from cte
where cte.LocationID = c.LocationID
and cte.PartNumberID=c.PartNumberID
and cte.RecordAddedDate<c.RecordAddedDate
)
order by LocationID,RecordAddedDate
Thank you
Use lag() over rows ordered by RecordAddedDate desc to flag each change in PartNumberID.
A cumulative sum, sum(isChange), then gives related rows the same group number; grp = 0 holds the rows since the last change.
To get the row with the minimum RecordAddedDate in that group, use row_number().
with
cte1 as
(
select *,
isChange = case when PartNumberID
= isnull(lag(PartNumberID) over (partition by Location
order by RecordAddedDate desc),
PartNumberID)
then 0
else 1
end
from mytable
),
cte2 as
(
select *, grp = sum(isChange) over (partition by Location order by RecordAddedDate desc)
from cte1
),
cte3 as
(
select *, rn = row_number() over (partition by Location order by RecordAddedDate)
from cte2 t
where t.grp = 0
)
select *
from cte3 t
where t.rn = 1
db<>fiddle demo
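The same idea (walk back from the newest row until PartNumberID changes, then keep the earliest row of that final run) can be sketched in plain Python to sanity-check the expected output. This is only an illustration under stated assumptions: the function name `start_of_last_run` is hypothetical, rows are `(location, date, part_number_id)` tuples, and dates are ISO strings so that they sort chronologically.

```python
from itertools import groupby


def start_of_last_run(rows):
    """Return, per location, the earliest row of the final run of
    unchanged part_number_id values."""
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        latest_part = group[-1][2]
        first = group[-1]
        # Walk backwards from the newest row until the part number changes.
        for row in reversed(group):
            if row[2] != latest_part:
                break
            first = row
        result.append(first)
    return result


rows = [
    (7, "2022-06-23", 1),
    (8, "2022-06-23", 1), (8, "2022-06-24", 1), (8, "2022-06-25", 1),
    (9, "2022-06-23", 1),
    (15, "2022-06-23", 1), (15, "2022-06-24", 1), (15, "2022-06-25", 2),
    (15, "2022-06-26", 1), (15, "2022-06-27", 1),
]
for row in start_of_last_run(rows):
    print(row)
# (7, '2022-06-23', 1)
# (8, '2022-06-23', 1)
# (9, '2022-06-23', 1)
# (15, '2022-06-26', 1)
```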

Assign column value based on the percentage of rows

In DB2 is there a way to assign a column value based on the first x%, then y% and remaining z% of rows?
I've tried using row_number() function but no luck!
Example below
Assuming that the below example count(id) is already arranged in descending order
Input:
ID count(id)
5 10
3 8
1 5
4 3
2 1
Output:
The first 30% of the rows of the above input should be assigned code H, the last 30% of the rows code L, and the remaining rows code M. If 30% of the rows evaluates to a decimal, round it to 0 decimal places.
ID code
5 H
3 H
1 M
4 L
2 L
You can use window functions:
select t.id,
(case ntile(3) over (order by count(id) desc)
when 1 then 'H'
when 2 then 'M'
when 3 then 'L'
end) as grp
from t
group by t.id;
This puts them into equal-sized groups.
For a 30-40-30% split with your conditions, you have to be more careful:
select t.id,
(case when (seqnum - 1.0) < 0.3 * num_ids then 'H'
when seqnum > 0.7 * num_ids then 'L'
else 'M'
end) as grp
from (select t.id,
count(*) as cnt,
count(*) over () as num_ids,
row_number() over (order by count(*) desc) as seqnum
from t
group by t.id
) t
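To check the 30-40-30 rule on the sample data, here is a small Python sketch of the same position-based test. It is an illustration, not the DB2 query: the function name `assign_codes` and its parameters are hypothetical, the IDs are assumed to be already sorted by count descending, and a fractional 30% is assumed to round up (1.5 rows becomes 2). Integer arithmetic is used to avoid floating-point surprises.

```python
def assign_codes(ids_in_order, head_pct=30, tail_pct=30):
    """ids_in_order: IDs already sorted by count, descending.
    Returns (id, code) pairs with 'H' for the head, 'L' for the tail, 'M' otherwise."""
    n = len(ids_in_order)
    codes = []
    for seqnum, id_ in enumerate(ids_in_order, start=1):
        if (seqnum - 1) * 100 < head_pct * n:
            code = "H"  # fewer than head_pct percent of the rows come before this one
        elif (n - seqnum) * 100 < tail_pct * n:
            code = "L"  # fewer than tail_pct percent of the rows come after this one
        else:
            code = "M"
        codes.append((id_, code))
    return codes


print(assign_codes([5, 3, 1, 4, 2]))
# [(5, 'H'), (3, 'H'), (1, 'M'), (4, 'L'), (2, 'L')]
```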
Try this:
with t(ID, count_id) as (values
(5, 10)
, (3, 8)
, (1, 5)
, (4, 3)
, (2, 1)
)
select t.*
, case
when pst <=30 then 'H'
when pst <=70 then 'M'
else 'L'
end as code
from
(
select t.*
, rownumber() over (order by count_id desc) as rn
, 100*rownumber() over (order by count_id desc)/nullif(count(1) over(), 0) as pst
from t
) t;
The result is:
ID COUNT_ID RN PST CODE
-- -------- -- --- ----
5 10 1 20 H
3 8 2 40 M
1 5 3 60 M
4 3 4 80 L
2 1 5 100 L

Retrieve the minimum value of a column from the max value of another column

I have a hive table t1 that looks like this:
ID Score1 score2
1 4 11
1 5 12
1 5 13
2 3 14
2 3 15
2 2 12
2 2 11
3 6 10
3 6 11
3 6 12
For each ID, I want to select the max value of score1; if that max value occurs more than once, I want min(score2) from the rows that contain max(score1).
So, I want the minimum score2 of the maximum score1 rows, the results should be something like this
ID Score1 score2
1 5 12
2 3 14
3 6 10
Most of the ideas I have turn this to be a very complicated query, and I think there is a simple solution for it that I am not able to find yet.
Any ideas?
Use window functions:
select t.*
from (select t.*,
row_number() over (partition by id order by score1 desc, score2 asc) as seqnum
from t
) t
where seqnum = 1;
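The ordering used for this kind of pick (highest score1 first, then lowest score2) can be checked against the sample data with a short Python sketch. The function name `best_rows` is hypothetical, and rows are assumed to be `(id, score1, score2)` tuples.

```python
def best_rows(rows):
    """For each id keep the row with the highest score1,
    breaking ties by the lowest score2."""
    best = {}
    for id_, score1, score2 in rows:
        current = best.get(id_)
        # Comparing (-score1, score2) mirrors ORDER BY score1 DESC, score2 ASC.
        if current is None or (-score1, score2) < (-current[1], current[2]):
            best[id_] = (id_, score1, score2)
    return sorted(best.values())


rows = [
    (1, 4, 11), (1, 5, 12), (1, 5, 13),
    (2, 3, 14), (2, 3, 15), (2, 2, 12), (2, 2, 11),
    (3, 6, 10), (3, 6, 11), (3, 6, 12),
]
print(best_rows(rows))
# [(1, 5, 12), (2, 3, 14), (3, 6, 10)]
```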
Try:
SELECT Z.ID, Z.SCORE1, MIN(C.SCORE2) AS SCORE2
FROM
(SELECT A.ID, A.SCORE1 FROM
YOUR_TABLE A
INNER JOIN
(SELECT ID, MAX(SCORE1) AS SCORE1 FROM YOUR_TABLE GROUP BY ID) B
ON A.ID = B.ID AND A.SCORE1 = B.SCORE1
GROUP BY A.ID, A.SCORE1
-- Note: this HAVING drops IDs whose max SCORE1 occurs only once; remove it to keep them.
HAVING COUNT(*)>1
) Z
INNER JOIN YOUR_TABLE C
ON Z.ID = C.ID AND Z.SCORE1 = C.SCORE1
GROUP BY Z.ID, Z.SCORE1;
You can do this with window functions (QUALIFY is Teradata-style syntax and is not available in every dialect):
SELECT ID, score1, MIN(score2) AS score2
FROM (
SELECT score1, score2, ID
FROM (
SELECT score1, score2, ID
FROM MyTable
QUALIFY RANK() OVER(PARTITION BY ID ORDER BY score1 DESC) = 1
) src
QUALIFY COUNT(*) OVER(PARTITION BY ID) > 1
) src
GROUP BY 1,2
Sorry, I'm writing this from my phone, so I can't format it well and there may be syntax errors too.
select t.id, min(t.score2)
from table1 t
inner join (
select id, max(score1) maxscore1 from table1 group by id
) d on t.id = d.id and t.score1 = d.maxscore1
group by t.id
having count(*) > 1 -- if the max value exists more than once
An alternative query, if the database supports analytic functions, is:
select
id, min(score2)
from (
select id, score1, score2
, max(score1) over(partition by id) max_score1
from table1
) d
where score1 = max_score1 -- keep only the rows holding the max score1
group by
id
having count(*) > 1 -- if the max value exists more than once

MS Sql Server, same column with a different row neighbors

I need a little help on a SQL query. I could not get the result that I wanted.
ID I10 H 10NS HNS CC NSCC
0 1 1 1 1 14 14
1 0 1 0 1 6 2
1 0 2 0 2 12 2
1 0 3 0 3 17 4
1 0 3 0 3 18 4
1 0 3 0 3 19 4
1 0 3 0 3 20 4
What I want is one row for each ID, the one with the highest CC.
For example,
ID I10 H 10NS HNS CC NSCC
0 1 1 1 1 14 14
1 0 3 0 3 20 4
I tried with this code:
SELECT a.ID, b.name, a.i10 as[i-10-index], a.h as[h-index], 10ns as[i-10-index based on non-self-citation], a.hns as [h-index based on non-self-citation],
max(a.[Citation Count]), (a.[Non-Self-Citation Count])
FROM tbl_lpNumerical as a
join tbl_lpAcademician as b
on a.ID= (b.ID-1)
GROUP BY a.ID, b.name, a.i10, a.h, a.10ns, a.hns,
a.[Non-Self-Citation Count]
order by a.ID desc
However, I could not get the desired results.
Thank you for your time.
You can simply select every row for which no other row with the same ID has a higher CC:
SELECT n.*
FROM tbl_lpNumerical n
WHERE NOT EXISTS ( SELECT 'b'
FROM tbl_lpNumerical n2
WHERE n2.ID = n.ID
AND n2.CC > n.CC
)
In SQL Server, you can use row_number() for this. Based on your sample data, something like:
select sd.*
from (select sd.*, row_number() over (partition by id order by cc desc) as seqnum
from sampledata sd
) sd
where seqnum = 1;
I have no idea what your query has to do with the sample data. If it generates the data, then you can use a CTE:
with sampledata as (
<some query here>
)
select sd.*
from (select sd.*, row_number() over (partition by id order by cc desc) as seqnum
from sampledata sd
) sd
where seqnum = 1;
The following query will select a single row from each ID partition: the one with the highest CC value:
SELECT *
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CC DESC) AS rn
FROM mytable) t
WHERE t.rn = 1
If there can be multiple rows having the same CC max value and you want all of them selected, then you can replace ROW_NUMBER() with RANK().
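The difference between the two choices shows up only when the top CC value is tied. The Python sketch below imitates both behaviours on `(id, cc)` pairs; the function name `top_by_cc` and its `keep_ties` flag are hypothetical. One caveat: ROW_NUMBER() picks an arbitrary row among ties unless the ORDER BY breaks them, whereas this sketch simply keeps the first tied row in input order.

```python
def top_by_cc(rows, keep_ties=False):
    """rows: (id, cc) pairs.
    keep_ties=False imitates ROW_NUMBER() = 1 (one row per id);
    keep_ties=True imitates RANK() = 1 (every row tied for the max)."""
    max_cc = {}
    for id_, cc in rows:
        max_cc[id_] = max(cc, max_cc.get(id_, cc))
    result, seen = [], set()
    for id_, cc in rows:
        if cc == max_cc[id_] and (keep_ties or id_ not in seen):
            result.append((id_, cc))
            seen.add(id_)
    return result


sample = [(0, 14), (1, 6), (1, 12), (1, 17), (1, 18), (1, 19), (1, 20)]
print(top_by_cc(sample))
# [(0, 14), (1, 20)]

tied = [(1, 20), (1, 20), (2, 5)]
print(top_by_cc(tied))                   # [(1, 20), (2, 5)]
print(top_by_cc(tied, keep_ties=True))   # [(1, 20), (1, 20), (2, 5)]
```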

Aggregate within a group of unchanged values

I have sample data:
RowId TypeId Value
1 1 34
2 1 53
3 1 34
4 2 43
5 2 65
6 16 54
7 16 34
8 1 45
9 6 43
10 6 34
11 16 64
12 16 63
I want to count the rows for each type (the Value does not matter to me), but only within each run of neighboring rows that share the same TypeId:
TypeId Count
1 3
2 2
16 2
1 1
6 2
16 2
How to achieve this result?
This should give you COUNT of rows within a group of unchanged values:
SELECT TypeId, grp, COUNT(*) FROM (
SELECT RowId, TypeId , Value, gap, SUM(gap) over (ORDER BY RowId ) grp
FROM (SELECT RowId, TypeId , Value,
CASE WHEN TypeId = lag(TypeId) over (ORDER BY RowId )
THEN 0
ELSE 1
END gap
FROM dummy
) t
) tt
GROUP BY TypeId, grp;
If you prefer WITH over endless sub-query inclusions:
WITH dummy_with_groups AS (
SELECT RowId, TypeId , Value, SUM(gap) OVER (ORDER BY RowId) grp
FROM (SELECT RowId, TypeId , Value,
CASE WHEN TypeId = lag(TypeId) OVER (ORDER BY RowId)
THEN 0 ELSE 1 END gap
FROM dummy) t
)
SELECT TypeId, COUNT(*) as Result
FROM dummy_with_groups
GROUP BY TypeId, grp;
http://www.sqlfiddle.com/#!6/f16e9/34
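The two steps used in these queries (flag a gap whenever TypeId differs from the previous row, then take a running sum of the flags as the group number) can be traced in a few lines of Python. This is a sketch for checking the logic, not a replacement for the SQL; the function name `count_runs` is hypothetical, and the input is assumed to be the TypeId values ordered by RowId.

```python
from collections import Counter


def count_runs(type_ids):
    """type_ids: TypeId values ordered by RowId.
    Returns (type_id, count) for each run of unchanged values, in order."""
    grp = 0
    previous = object()  # sentinel: the first row always starts a new group
    labelled = []
    for type_id in type_ids:
        gap = 0 if type_id == previous else 1  # like CASE WHEN TypeId = lag(TypeId)
        grp += gap                             # like SUM(gap) OVER (ORDER BY RowId)
        labelled.append((grp, type_id))
        previous = type_id
    counts = Counter(labelled)                 # like GROUP BY TypeId, grp
    return [(type_id, counts[(g, type_id)]) for g, type_id in sorted(counts)]


print(count_runs([1, 1, 1, 2, 2, 16, 16, 1, 6, 6, 16, 16]))
# [(1, 3), (2, 2), (16, 2), (1, 1), (6, 2), (16, 2)]
```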
Check this fiddle demo. I have renamed your columns a little.
WITH myCTE AS
(SELECT row_id,
type_id,
ROW_NUMBER () OVER (PARTITION BY type_id ORDER BY row_id)
AS cnt,
CASE LEAD (type_id) OVER (ORDER BY row_id)
WHEN type_id THEN 0
ELSE 1
END
AS show
FROM dummy),
innerQuery AS
(SELECT row_id, type_id, cnt
FROM myCTE
WHERE show = 1)
SELECT iq1.type_id, iq1.cnt - ISNULL (iq2.cnt, 0) CNT
FROM innerQuery iq1
LEFT OUTER JOIN innerQuery iq2
ON iq1.type_id = iq2.type_id
AND EXISTS
(SELECT 1
FROM innerQuery iq3
WHERE iq3.type_id = iq1.type_id
AND iq3.row_id < iq1.row_id
HAVING MAX (iq3.row_id) = iq2.row_id)
The output is exactly as expected.