How to find regions where total of their sale exceeded 60% - sql

I have a table interest_summary with two columns:
int_rate number,
total_balance number
example
10.25 50
10.50 100
10.75 240
11.00 20
My query should return two columns, or a string such as "10.50 to 10.75", because the balances for those rates add up to more than 60% of the overall total balance.
Could you suggest the logic for this in Oracle?

select
    min(int_rate),
    max(int_rate)
from
(
    select
        int_rate,
        nvl(sum(total_balance) over(
            order by total_balance desc
            rows between unbounded preceding and 1 preceding
        ),0) as part_sum
    from interest_summary
)
where
    part_sum < (select 0.6*sum(total_balance) from interest_summary)

I'm assuming that you're selecting the rows based on the following algorithm:
1. Sort your rows by total_balance (descending).
2. Select the highest total_balance row remaining.
3. If its total_balance added to the running total is under 60% of the overall total, add it to the pool and get the next row (step 2).
4. If not, add the row to the pool and return.
The sorted running total looks like this (I'll number the rows so that it's easier to understand what happens):
WITH data AS (
   SELECT 1 id, 10.25 interest_rate, 50 total_balance FROM DUAL
   UNION ALL SELECT 2 id, 10.50 interest_rate, 100 total_balance FROM DUAL
   UNION ALL SELECT 3 id, 10.75 interest_rate, 240 total_balance FROM DUAL
   UNION ALL SELECT 4 id, 11.00 interest_rate, 20 total_balance FROM DUAL
)
SELECT id, interest_rate,
       SUM(total_balance) OVER (ORDER BY total_balance DESC) running_total,
       SUM(total_balance) OVER (ORDER BY total_balance DESC)
       / SUM(total_balance) OVER () * 100 pct_running_total
  FROM data
 ORDER BY 3;
        ID INTEREST_RATE RUNNING_TOTAL PCT_RUNNING_TOTAL
---------- ------------- ------------- -----------------
         3         10,75           240  58,5365853658537
         2          10,5           340  82,9268292682927
         1         10,25           390  95,1219512195122
         4            11           410               100
So in this example we must return rows 3 and 2 because row 2 is the first row where its percent running total is above 60%:
WITH data AS (
   SELECT 1 id, 10.25 interest_rate, 50 total_balance FROM DUAL
   UNION ALL SELECT 2 id, 10.50 interest_rate, 100 total_balance FROM DUAL
   UNION ALL SELECT 3 id, 10.75 interest_rate, 240 total_balance FROM DUAL
   UNION ALL SELECT 4 id, 11.00 interest_rate, 20 total_balance FROM DUAL
)
SELECT ID, interest_rate
  FROM (SELECT ID, interest_rate,
               SUM(over_limit)
                  OVER(ORDER BY total_balance DESC) over_limit_no
          FROM (SELECT id,
                       interest_rate,
                       total_balance,
                       CASE
                          WHEN SUM(total_balance)
                                  OVER(ORDER BY total_balance DESC)
                               / SUM(total_balance) OVER() * 100 < 60 THEN
                           0
                          ELSE
                           1
                       END over_limit
                  FROM data
                 ORDER BY 3))
 WHERE over_limit_no <= 1;
        ID INTEREST_RATE
---------- -------------
         3         10,75
         2          10,5
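For readers who find the selection rule easier to follow outside SQL, here is a minimal Python sketch of the steps listed in this answer. The function name and the `share` parameter are made up for illustration, and the sketch assumes the same reading of the question as the answer does (largest balances first, stop at the first row that reaches the threshold).

```python
def rates_covering_share(rows, share=0.6):
    """rows: (int_rate, total_balance) pairs; returns (min_rate, max_rate)."""
    total = sum(balance for _, balance in rows)
    picked, running = [], 0
    # Largest balances first, mirroring ORDER BY total_balance DESC.
    for rate, balance in sorted(rows, key=lambda r: r[1], reverse=True):
        picked.append(rate)
        running += balance
        if running >= share * total:
            # Threshold reached: keep this row and stop.
            break
    return min(picked), max(picked)


sample = [(10.25, 50), (10.50, 100), (10.75, 240), (11.00, 20)]
result = rates_covering_share(sample)
print(result)
```

With the sample data, the loop stops after the rows with balances 240 and 100, which are the same two rows the SQL returns.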

Related

Select max of nested id from amazon redshift

My database is Amazon Redshift.
I have a table that looks like this -
id  nested_id  date          value
1   10         '2021-01-01'  5
1   20         '2021-01-01'  10
1   10         '2021-01-02'  6
1   20         '2021-01-02'  11
1   10         '2021-01-03'  7
1   20         '2021-01-03'  12
2   30         '2021-01-01'  5
2   40         '2021-01-01'  10
2   30         '2021-01-02'  6
2   40         '2021-01-02'  11
2   30         '2021-01-03'  7
2   40         '2021-01-03'  12
So this is basically a table that tracks values by id over time, except for every id there can be a nested_id. And the dates and values are primarily connected to the nested_id.
However, let's say I'm starting with the id field, but for each id I want to only return the points over time for the nested_id that has the greater sum of points.
So right now I'm just grabbing it like this...
select *
from mytable
where id in (1, 2)
Except that, for each id, I only want it to return the rows for the nested_id whose maximum value is the greatest.
So here's how I would do this manually.
For id of 1, the maximum value is 12, and the nested_id of that value is 20
For id of 2, the maximum value is 12, and the nested_id of that value is 40
So my return table should be
id  nested_id  date          value
1   20         '2021-01-01'  10
1   20         '2021-01-02'  11
1   20         '2021-01-03'  12
2   40         '2021-01-01'  10
2   40         '2021-01-02'  11
2   40         '2021-01-03'  12
Is there an easy way of performing this query? I'm assuming you have to partition somehow?
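You can solve this with the row_number window function:
with maxs as (
    select id,
           nested_id,
           value,
           -- rn = 1 marks the row holding the highest value for each id
           row_number() over (partition by id order by value desc) rn
    from mytable
)
select mt.*
from mytable mt
-- the filter on maxs.rn below discards unmatched rows, so a plain join is equivalent to the left join
join maxs on mt.id = maxs.id and mt.nested_id = maxs.nested_id
where maxs.rn = 1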
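If you want to try the row_number approach without a Redshift cluster, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (when window functions were added); the in-memory table and the final ORDER BY are additions for a reproducible check, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INT, nested_id INT, date TEXT, value INT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2021-01-01", 5), (1, 20, "2021-01-01", 10),
        (1, 10, "2021-01-02", 6), (1, 20, "2021-01-02", 11),
        (1, 10, "2021-01-03", 7), (1, 20, "2021-01-03", 12),
        (2, 30, "2021-01-01", 5), (2, 40, "2021-01-01", 10),
        (2, 30, "2021-01-02", 6), (2, 40, "2021-01-02", 11),
        (2, 30, "2021-01-03", 7), (2, 40, "2021-01-03", 12),
    ],
)

rows = conn.execute("""
    WITH maxs AS (
        SELECT id, nested_id,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY value DESC) AS rn
        FROM mytable
    )
    SELECT mt.id, mt.nested_id, mt.date, mt.value
    FROM mytable mt
    JOIN maxs ON mt.id = maxs.id AND mt.nested_id = maxs.nested_id
    WHERE maxs.rn = 1
    ORDER BY mt.id, mt.date
""").fetchall()

for row in rows:
    print(row)
```

One caveat: if two nested_ids under the same id share the same maximum value, row_number picks one of them arbitrarily.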

Running assignment of values with break T-SQL

With the below table of data
Customer  Amount Billed  Amount Paid  Date
1         100            60           01/01/2000
1         100            40           01/02/2000
2         200            150          01/01/2000
2         200            30           01/02/2000
2         200            10           01/03/2000
2         200            15           01/04/2000
I would like to create the next two columns
Customer  Amount Billed  Amount Paid  Assigned  Remainder  Date
1         100            60           60        40         01/01/2000
1         100            40           40        0          01/02/2000
2         200            150          150       50         01/01/2000
2         200            30           30        20         01/02/2000
2         200            10           10        10         01/03/2000
2         200            15           10        -5         01/04/2000
The amount paid on each line should be removed from the amount billed and pushed onto the next line for the same customer. The process should continue until there are no more records or the remainder is < 0.
Is there a way of doing this without a cursor? Maybe a recursive CTE?
Thanks
As I mentioned in the comments, this is just a cumulative SUM:
WITH YourTable AS(
    SELECT *
    FROM (VALUES(1,100,60 ,CONVERT(date,'01/01/2000')),
                (1,100,40 ,CONVERT(date,'01/02/2000')),
                (2,200,150,CONVERT(date,'01/01/2000')),
                (2,200,30 ,CONVERT(date,'01/02/2000')),
                (2,200,10 ,CONVERT(date,'01/03/2000')),
                (2,200,15 ,CONVERT(date,'01/04/2000')))V(Customer,AmountBilled,AmountPaid,[Date]))
SELECT Customer,
       AmountBilled,
       AmountPaid,
       AmountBilled - SUM(AmountPaid) OVER (PARTITION BY Customer ORDER BY [Date] ASC
                                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Remainder,
       [Date]
FROM YourTable
ORDER BY Customer,
         [Date];
Note this returns -5 for the last row, not 5, as 200 - 205 = -5. If you want 5, wrap the whole expression in ABS().
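The query above only produces Remainder. As a rough sketch of how the Assigned column from the question could be derived from the same running total, here is a version run through Python's sqlite3 module (assuming SQLite 3.25+ for window functions). The table name `payments`, the column names, and the ISO date strings are placeholders of mine, and the capping rule (a payment is assigned only up to what was still owed before it) is my reading of the expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (customer INT, billed INT, paid INT, paid_on TEXT)"
)
conn.executemany("INSERT INTO payments VALUES (?, ?, ?, ?)", [
    (1, 100, 60, "2000-01-01"), (1, 100, 40, "2000-01-02"),
    (2, 200, 150, "2000-01-01"), (2, 200, 30, "2000-01-02"),
    (2, 200, 10, "2000-01-03"), (2, 200, 15, "2000-01-04"),
])

rows = conn.execute("""
    SELECT customer, paid,
           -- amount still owed before this payment, capped by the payment itself
           MAX(0, MIN(paid, billed - (running - paid))) AS assigned,
           billed - running AS remainder
    FROM (
        SELECT *,
               SUM(paid) OVER (PARTITION BY customer ORDER BY paid_on
                               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running
        FROM payments
    )
    ORDER BY customer, paid_on
""").fetchall()

for row in rows:
    print(row)
```

For the last payment of customer 2, only 10 was still owed, so 10 is assigned and the remainder goes to -5, matching the table in the question.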
You can achieve this using a recursive CTE as well.
DECLARE @customer table (Customer int, AmountBilled int, AmountPaid int, PaidDate date)
insert into @customer
values
 (1 ,100, 60 ,'01/01/2000')
,(1 ,100, 40 ,'01/02/2000')
,(2 ,200, 150 ,'01/01/2000')
,(2 ,200, 30 ,'01/02/2000')
,(2 ,200, 10 ,'01/03/2000')
,(2 ,200, 15 ,'01/04/2000');
;WITH CTE_CustomerRNK as
(
    SELECT *, ROW_NUMBER() OVER(PARTITION BY customer order by paiddate) AS RNK
    from @customer),
CTE_Customer as
(
    SELECT customer, AmountBilled, AmountPaid, (amountbilled-amountpaid) as remainder, paiddate ,RNK FROM CTE_CustomerRNK where rnk = 1
    union all
    SELECT r.customer, r.AmountBilled, r.AmountPaid, (c.remainder - r.AmountPaid) as remainder, r.PaidDate, r.rnk
    FROM CTE_CustomerRNK as r
    inner join CTE_Customer as c
    on c.Customer = r.Customer
    and r.rnk = c.rnk + 1
)
SELECT customer, AmountBilled, AmountPaid, remainder, paiddate
FROM CTE_Customer order by Customer
customer  AmountBilled  AmountPaid  remainder  paiddate
1         100           60          40         2000-01-01
1         100           40          0          2000-01-02
2         200           150         50         2000-01-01
2         200           30          20         2000-01-02
2         200           10          10         2000-01-03
2         200           15          -5         2000-01-04

Estimation of Cumulative value every 3 months in SQL

I have a table like this:
ID Date Prod
1 1/1/2009 5
1 2/1/2009 5
1 3/1/2009 5
1 4/1/2009 5
1 5/1/2009 5
1 6/1/2009 5
1 7/1/2009 5
1 8/1/2009 5
1 9/1/2009 5
And I need to get the following result:
ID Date Prod CumProd
1 2009/03/01 5 15 ---Each 3 months
1 2009/06/01 5 30 ---Each 3 months
1 2009/09/01 5 45 ---Each 3 months
What could be the best approach to take in SQL?
You can try the below - using window function
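select * from
(
    select *,sum(prod) over(order by DATEPART(qq,dateval)) as cum_sum,
    -- ordered descending so that rn=1 is the last month of each quarter, as in the expected result
    row_number() over(partition by DATEPART(qq,dateval) order by dateval desc) as rn
    from t
)A where rn=1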
How about just filtering on the month number?
select t.*
from (select id, date, prod, sum(prod) over (partition by id order by date) as running_prod
from t
) t
where month(date) in (3, 6, 9, 12);
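To see the "running sum, then filter on month" idea produce the expected three rows, here is a small sketch using Python's sqlite3 module (assuming SQLite 3.25+ for window functions). SQLite has no month() function, so strftime('%m', ...) stands in for it, and the sample dates are rewritten as ISO strings on the assumption that the question's dates are month/day/year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, date TEXT, prod INT)")
# Nine monthly rows for id 1, each with prod = 5, as in the question.
conn.executemany(
    "INSERT INTO t VALUES (1, ?, 5)",
    [(f"2009-{m:02d}-01",) for m in range(1, 10)],
)

rows = conn.execute("""
    SELECT id, date, prod, running_prod
    FROM (
        SELECT id, date, prod,
               SUM(prod) OVER (PARTITION BY id ORDER BY date) AS running_prod
        FROM t
    )
    WHERE CAST(strftime('%m', date) AS INTEGER) IN (3, 6, 9, 12)
    ORDER BY id, date
""").fetchall()

for row in rows:
    print(row)
```

The running total is computed over all months first and the filter is applied afterwards, which is why the kept rows carry the cumulative values rather than per-quarter sums.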

Identify New Seller (without buying in recent 3 months)

In BigQuery, I have a table of user transaction records with 3 columns: Month, Date, ID.
Here is the example
I want to identify which IDs are new sellers in each month. A new seller is defined as a seller with no purchases in the previous 3 months.
I tried applying row_number to the IDs ordered by date and ID, reckoning that any row whose row_number is not in (2,3,4) is a new seller. However, an ID can skip a month and buy again the following month, and my code doesn't handle that situation.
Could you please help me to solve this problem? Thank you very much.
Below is for BigQuery Standard SQL
#standardSQL
SELECT *,
COUNT(1) OVER(
PARTITION BY id
ORDER BY DATE_DIFF(`date`, '2000-01-01', MONTH)
RANGE BETWEEN 4 PRECEDING AND 1 PRECEDING
) = 0 AS new_seller
FROM `project.dataset.table`
You can test and play with the above using the sample data from your question, as in the example below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'Mar-19' month, DATE '2019-03-01' `date`, 1 id UNION ALL
SELECT 'Mar-19', '2019-03-03', 2 UNION ALL
SELECT 'Mar-19', '2019-03-04', 3 UNION ALL
SELECT 'Apr-19', '2019-04-05', 3 UNION ALL
SELECT 'Apr-19', '2019-04-06', 4 UNION ALL
SELECT 'Apr-19', '2019-04-07', 5 UNION ALL
SELECT 'May-19', '2019-05-03', 3 UNION ALL
SELECT 'May-19', '2019-05-04', 6 UNION ALL
SELECT 'May-19', '2019-05-05', 5 UNION ALL
SELECT 'Jun-19', '2019-06-06', 1 UNION ALL
SELECT 'Jun-19', '2019-06-07', 7 UNION ALL
SELECT 'Jun-19', '2019-06-08', 8 UNION ALL
SELECT 'Jun-19', '2019-06-09', 9 UNION ALL
SELECT 'Jul-19', '2019-07-05', 2 UNION ALL
SELECT 'Jul-19', '2019-07-06', 5 UNION ALL
SELECT 'Jul-19', '2019-07-07', 9
)
SELECT *,
COUNT(1) OVER(
PARTITION BY id
ORDER BY DATE_DIFF(`date`, '2000-01-01', MONTH)
RANGE BETWEEN 4 PRECEDING AND 1 PRECEDING
) = 0 AS new_seller
FROM `project.dataset.table`
-- ORDER BY `date`
with the output below
Row month date id new_seller
1 Mar-19 2019-03-01 1 true
2 Mar-19 2019-03-03 2 true
3 Mar-19 2019-03-04 3 true
4 Apr-19 2019-04-05 3 false
5 Apr-19 2019-04-06 4 true
6 Apr-19 2019-04-07 5 true
7 May-19 2019-05-03 3 false
8 May-19 2019-05-04 6 true
9 May-19 2019-05-05 5 false
10 Jun-19 2019-06-06 1 false
11 Jun-19 2019-06-07 7 true
12 Jun-19 2019-06-08 8 true
13 Jun-19 2019-06-09 9 true
14 Jul-19 2019-07-05 2 false
15 Jul-19 2019-07-06 5 false
16 Jul-19 2019-07-07 9 false
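To make it clearer what the window frame is counting, here is a plain Python sketch of the same rule: a row is flagged as a new seller when the same id has no row whose month index falls 1 to 4 months earlier. The function name and the `lookback` parameter are illustrative only; the value 4 simply mirrors the `4 PRECEDING` bound in the query above.

```python
from datetime import date


def flag_new_sellers(rows, lookback=4):
    """rows: (date, id) pairs; returns a parallel list of new_seller booleans."""
    def month_index(d):
        # Same idea as DATE_DIFF(date, some fixed date, MONTH): a running month count.
        return d.year * 12 + d.month

    flags = []
    for d, seller in rows:
        lo, hi = month_index(d) - lookback, month_index(d) - 1
        seen_recently = any(
            other == seller and lo <= month_index(od) <= hi
            for od, other in rows
        )
        flags.append(not seen_recently)
    return flags


rows = [
    (date(2019, 3, 1), 1), (date(2019, 3, 3), 2), (date(2019, 3, 4), 3),
    (date(2019, 4, 5), 3), (date(2019, 4, 6), 4), (date(2019, 4, 7), 5),
    (date(2019, 5, 3), 3), (date(2019, 5, 4), 6), (date(2019, 5, 5), 5),
    (date(2019, 6, 6), 1), (date(2019, 6, 7), 7), (date(2019, 6, 8), 8),
    (date(2019, 6, 9), 9),
    (date(2019, 7, 5), 2), (date(2019, 7, 6), 5), (date(2019, 7, 7), 9),
]
flags = flag_new_sellers(rows)
for (d, seller), is_new in zip(rows, flags):
    print(d, seller, is_new)
```

Note that id 2 in July is flagged false only because March is exactly 4 months back; with a lookback of 3 it would count as a new seller again.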

How can I add a subtotal of specific column values?

Here is the DATASET:
and here is the SQL I have:
select f.DATE, f.PROD_STATUS,
count (*) AS TOTAL
from PROD_TABLE f
where DATE = '04-MAY-17'
GROUP BY f.DATE, f.PROD_STATUS
I'm trying to get the value for 'SUCCESS' as a column in the SQL Results:
(SUCCESS = READY_1 + READY_2 + READY_3 + READY_4 + READY_5)
I want the SQL results to look like this:
How can I achieve that?
Check this:
with t as (
select 1 as ready_1,
2 as ready_2,
3 as ready_3,
1 as in_process,
4 as fail,
5 as crash,
'5/4/2017' as dat
from dual
union all
select 2 as ready_1,
2 as ready_2,
3 as ready_3,
1 as in_process,
4 as fail,
0 as crash,
'5/5/2017' as dat
from dual
)
select dat, prod_stat, max(suc) over(partition by dat) as success, sum(value) over(partition by dat) as total
from (
select dat, prod_stat, value, sum(value) over (partition by dat) as suc
from t
unpivot(
value for prod_stat in (ready_1, ready_2, ready_3)
)
union all
select dat, prod_stat, value, null as suc
from t
unpivot(
value for prod_stat in (in_process, fail, crash)
)
)
Result:
DAT       PROD_STAT   SUCCESS  TOTAL
5/4/2017  READY_2     6        16
5/4/2017  READY_1     6        16
5/4/2017  CRASH       6        16
5/4/2017  FAIL        6        16
5/4/2017  IN_PROCESS  6        16
5/4/2017  READY_3     6        16
5/5/2017  FAIL        7        12
5/5/2017  IN_PROCESS  7        12
5/5/2017  CRASH       7        12
5/5/2017  READY_2     7        12
5/5/2017  READY_1     7        12
5/5/2017  READY_3     7        12
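If the counts are already in long form (one row per date and status, as the GROUP BY in the question would produce), the same SUCCESS and TOTAL columns can probably be had without UNPIVOT by using a conditional sum over a window. This is a different technique from the answer above, sketched here with Python's sqlite3 module (SQLite 3.25+ assumed); the `counts` table and its contents are made up to match the answer's sample numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counts (dat TEXT, prod_stat TEXT, value INT)")
conn.executemany("INSERT INTO counts VALUES (?, ?, ?)", [
    ("5/4/2017", "READY_1", 1), ("5/4/2017", "READY_2", 2),
    ("5/4/2017", "READY_3", 3), ("5/4/2017", "IN_PROCESS", 1),
    ("5/4/2017", "FAIL", 4), ("5/4/2017", "CRASH", 5),
    ("5/5/2017", "READY_1", 2), ("5/5/2017", "READY_2", 2),
    ("5/5/2017", "READY_3", 3), ("5/5/2017", "IN_PROCESS", 1),
    ("5/5/2017", "FAIL", 4), ("5/5/2017", "CRASH", 0),
])

rows = conn.execute("""
    SELECT dat, prod_stat,
           -- only the READY_* statuses contribute to the success subtotal
           SUM(CASE WHEN prod_stat LIKE 'READY%' THEN value ELSE 0 END)
               OVER (PARTITION BY dat) AS success,
           SUM(value) OVER (PARTITION BY dat) AS total
    FROM counts
    ORDER BY dat, prod_stat
""").fetchall()

for row in rows:
    print(row)
```

Every row for a given date carries the same subtotal and total, as in the result shown above.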