Ranking a record based on a pre-calculated field - sql

I have this query and I am trying to figure out how to rank
in the DENSE_RANK area by 'Status'. Status is a row from a case when statement when the current month measure is greater than the 6 mos average.
Basically I want to rank the largest to smallest 'GoalAchieved' status partitioned by territory by Order by sum(revenue). Same with the inprogress status.
So each territory should have a rank of 1,2,3,4 for goalachieved status and the same thing for 'Inprogress' Status.
Any help would be awesome.
SELECT DISTINCT a.ClinLocationID, Terrioty,
ClinLocationName, Round(Sum(revenue),2)
CASE WHEN CM > LastSixmosAvg THEN 'GoalAchieved'
ELSE 'InProgress'
END as Status,
--3 month test
CASE WHEN CM > Last3MosAvg THEN 'Winner'
ELSE 'Loser'
END as ThreeMonthStatus,
CASE WHEN CM > LastSixmosAvg THEN RANK () OVER (partition by Terrioty order by Sum(Revenue) desc)

Related

Window function or Recursive query in Redshift

I try to classify customer based on monthly sales. Logic for calculation:
new customer – it has sales and appears for first time or has sales and returned after being lost (after 4 month period, based on sales_date)
active customer - not new and not lost.
lost customer - no sales and period (based on sales_date) more than 4 months
This is the desired output I'm trying to achieve:
The below Window function in Redshift classify however it is not correct.
It classified lost when difference between month > 4 in one row, however it did not classify lost if it was previous lost and revenue 0 until new status appear. How it can be updated?
with customer_status (
select customer_id,customer_name,sales_date,sum(sales_amount) as sales_amount_calc,
nvl(DATEDIFF(month, LAG(reporting_date) OVER (PARTITION BY customer_id ORDER by sales_date ASC), sales_date),0) AS months_between_sales
from customer
GROUP BY customer_id,customer_name,sales_date
)
select *,
case
WHEN months_between_sales = 0 THEN 'New'
WHEN months_between_sales > 0 THEN 'Active'
WHEN months_between_sales > 0 AND months_between_sales <= 4 and sales_amount_calc = 0 THEN 'Active'
WHEN /* months_between_sales > 0 and */ months_between_sales > 4 and sales_amount_calc = 0 THEN 'Lost'
ELSE 'Unknown'
END AS status
from customer_status
One way to solve to get cumulative sales amount partitioned on subset of potentially lost customers ( sales_amount = 0).
Cumulative amount for the customer partitioned
sum(months_between_sales) over (PARTITION BY customer_id ORDER by sales_date ASC rows unbounded preceding) as cumulative_amount,
How can it be changed to get sub-partitioned, for example when sales amount= 0 , in order to get lost correctly?
Does someone have ideas to translate the above logic into an
recursive query in Redshift?
Thank you

SQL - Rank monthly dataset high, medium, low

I have a table which includes the month, accountID and a set of application scores. I want to create a new column which either gives a 'high', 'medium' or 'low' for the top, middle and bottom 33% of the results each month.
If I use rank() I can order the application scores for a single month or the whole dataset but I'm unsure how to order it per month. Also, on my version of sql server percent_rank() does not work.
select
AccountID
, ApplicationScore
, rank() over (order by applicationscore asc) as Rank
from Table
I then know I need to put the rank() statement in a subquery and then use a case statement to apply the 'high', 'medium' or 'low'.
select
AccountID
, case when rank <= total/3 then 'low'
when rank > total/3 and rank <= (total/3)*2 then 'medium'
when rank > (total/3)*2 then 'high' end ApplicationScore
from (subquery) a
Ntile(3) worked very well
select
AccountID
, Monthstart
, ApplicationScore
, ntile(3) over (partition by monthstart order by applicationscore) Rank
from table
SQL Server may have something built in to handle your problem. But we can easily use a ratio of counts to find the three segments of your scores, for each month. The ratio we can use is the count, partitioned by month and ordered by score, divided by the count for the entire month.
WITH cte AS (
SELECT *,
1.0 * COUNT(*) OVER (PARTITION BY Month ORDER BY ApplicationScore) /
COUNT(*) OVER (PARTITION BY Month) AS cnt
FROM yourTable
)
SELECT
AccountID,
Month,
ApplicationScore,
CASE WHEN cnt < 0.34 THEN 'low'
WHEN cnt < 0.67 THEN 'medium'
ELSE 'high' END AS rank
FROM cte
ORDER BY
Month,
ApplicationScore DESC;
Demo

SQL sum over(partition) not subtracting negative values in SUM

I have the following query which outputs a list of transactions per user - units spent and units earned - column 'Amount'.
I have managed to group this per user and do a running total - column 'Running_Total_Spend'.
However it is ADDING the negative 'Amount' values rather than subtracting them. Sp pretty sure it is the SUM part of query not working.
WITH cohort AS(
SELECT DISTINCT userID FROM events_live WHERE startDate = '2018-07-26' LIMIT 50),
my_events AS (
SELET events_live.* FROM events_live WHERE eventDate >= '2018-07-26')
SELECT cohort.userID,
my_events.eventDate,
my_events.eventTimestamp,
CASE
--spent resource outputs a negative value ---working
WHEN transactionVector = 'SPENT' THEN -abs(my_events.productAmount)
--earned resource outputs a positive value ---working
WHEN transactionVector = 'RECEIVED' THEN my_events.productAmount END AS Amount,
ROW_NUMBER() OVER (PARTITION BY cohort.userID ORDER BY cohort.userID, eventTimestamp asc) AS row,
--sum the values in column 'Amount' for this partition
--should sum positive and negative values ---NOT WORKING--converting negatives into positive
--------------------------------------------------
SUM(CASE WHEN my_events.productAmount >= 0 THEN my_events.productAmount
WHEN my_events.productAmount <0 THEN -abs(my_events.productAmount) end) OVER(PARTITION BY cohort.userID ORDER BY cohort.userID, eventTimestamp asc) AS Running_Total_Spend
---------------------------------------------------
FROM cohort
INNER JOIN my_events ON cohort.userID=my_events.userID
WHERE productName = 'COINS' AND transactionVector IN ('SPENT','RECEIVED')
I suspect you want that logic around transactionvector for the sum too as my_events.productamount seems to be always positive.
...
sum(CASE
WHEN transactionvector = 'SPENT' THEN
-my_events.productamount
WHEN transactionvector = 'RECEIVED' THEN
my_events.productamount
END) OVER (PARTITION BY cohort.userid
ORDER BY cohort.userid,
eventTimestamp) running_total_spend
...
Update your sum function to -
SUM(my_events.productAmount) OVER(PARTITION BY cohort.userID ORDER BY cohort.userID, eventTimestamp asc) AS Running_Total_Spend

Query for getting previous date in oracle in specific scenario

I have the below data in a table A which I need to insert into table B along with one computed column.
TABLE A:
Account_No | Balance | As_on_date
1001 |-100 | 1-Jan-2013
1001 |-150 | 2-Jan-2013
1001 | 200 | 3-Jan-2013
1001 |-250 | 4-Jan-2013
1001 |-300 | 5-Jan-2013
1001 |-310 | 6-Jan-2013
Table B:
In table B, there should be no of days to be shown when balance is negative and
the date one which it has gone into negative.
So, for 6-Jan-2013, this table should show below data:
Account_No | Balance | As_on_date | Days_passed | Start_date
1001 | -310 | 6-Jan-2013 | 3 | 4-Jan-2013
Here, no of days should be the days when the balance has gone negative in recent time and
not from the old entry.
I need to write a SQL query to get the no of days passed and the start date from when the
balance has gone negative.
I tried to formulate a query using Lag analytical function, but I am not succeeding.
How should I check the first instance of negative balance by traversing back using LAG function?
Even the first_value function was given a try but not getting how to partition in it based on negative value.
Any help or direction on this will be really helpful.
Here's a way to achive this using analytical functions.
INSERT INTO tableb
WITH tablea_grouped1
AS (SELECT account_no,
balance,
as_on_date,
SUM (CASE WHEN balance >= 0 THEN 1 ELSE 0 END)
OVER (PARTITION BY account_no ORDER BY as_on_date)
grp
FROM tablea),
tablea_grouped2
AS (SELECT account_no,
balance,
as_on_date,
grp,
LAST_VALUE (
balance)
OVER (
PARTITION BY account_no, grp
ORDER BY as_on_date
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING)
closing_balance
FROM tablea_grouped1
WHERE balance < 0
AND grp != 0 --keep this, if starting negative balance is to be ignored
)
SELECT account_no,
closing_balance,
MAX (as_on_date),
MAX (as_on_date) - MIN (as_on_date) + 1,
MIN (as_on_date)
FROM tablea_grouped2
GROUP BY account_no, grp, closing_balance
ORDER BY account_no, MIN (as_on_date);
First, SUM is used as analytical function to assign group number to consecutive balances less than 0.
LAST_VALUE function is then used to find the last -ve balance in each group
Finally, the result is aggregated based on each group. MAX(date) gives the last date, MIN(date) gives the starting date, and the difference of the two gives number of days.
Demo at sqlfiddle.
Try this and use gone_negative to computing specified column value for insert into another table:
select temp.account_no,
temp.balance,
temp.prev_balance,
temp.on_date,
temp.prev_on_date,
case
WHEN (temp.balance < 0 and temp.prev_balance >= 0) THEN
1
else
0
end as gone_negative
from (select account_no,
balance,
on_date,
lag(balance, 1, 0) OVER(partition by account_no ORDER BY account_no) prev_balance,
lag(on_date, 1) OVER(partition by account_no ORDER BY account_no) prev_on_date
from tblA
order by account_no) temp;
Hope this helps pal.
Here's on way to do it.
Select all records from my_table where the balance is positive.
Do a self-join and get all the records that have a as_on_date is greater than the current row, but the amounts are in negative
Once we get these, we cut-off the rows WHERE the date difference between the current and the previous row for as_on_date is > 1. We then filter the results a outer sub query
The Final select just groups the rows and gets the min, max values for the filtered rows which are grouped.
Query:
SELECT
account_no,
min(case when row_number = 1 then balance end) as balance,
max(mt2_date) as As_on_date,
max(mt2_date) - mt1_date as Days_passed,
min(mt2_date) as Start_date
FROM
(
SELECT
*,
MIN(break_date) OVER( PARTITION BY mt1_date ) AS min_break_date,
ROW_NUMBER() OVER( PARTITION BY mt1_date ORDER BY mt2_date desc ) AS row_number
FROM
(
SELECT
mt1.account_no,
mt2.balance,
mt1.as_on_date as mt1_date,
mt2.as_on_date as mt2_date,
case when mt2.as_on_date - lag(mt2.as_on_date,1) over () > 1 then mt2.as_on_date end as break_date
FROM
my_table mt1
JOIN my_table mt2 ON ( mt2.balance < mt1.balance AND mt2.as_on_date > mt1.as_on_date )
WHERE
MT1.balance > 0
order by
mt1.as_on_date,
mt2.as_on_date ) sub_query
) T
WHERE
min_break_date is null
OR mt2_date < min_break_date
GROUP BY
mt1_date,
account_no
SQLFIDDLE
I have a added a few more rows in the FIDDLE, just to test it out

Sum Until Value Reached - Teradata

In Teradata, I need a query to first identify all members in the MEM TABLE that currently have a negative balance, let's call that CUR_BAL. Then, for all of those members only, sum all transactions from the TRAN TABLE in order by date until the sum of those transactions is equal to the CUR_BAL.
Editing to add a third ADJ table that contains MEM_NBR, ADJ_DT and ADJ_AMT that need to be included in the running total in order to capture all of the records.
I would like the outcome to include the MEM.MEM_NBR, MEM.CUR_BAL, TRAN.TRAN_DATE OR ADJ.ADJ_DT (date associated with the transaction that resulted in the running total to equal CUR_BAL), MEM.LST_UPD_DT. I don't need to know if the balance is negative as a result of a transaction or adjustment, just the date that it went negative.
Thank you!
select
mem_nbr,
cur_bal,
tran_date,
tran_type
from (
select
a.mem_nbr,
a.cur_bal,
b.tran_date,
b.tran_type,
a.lst_upd_dt,
sum(b.tran_amt) over (partition by b.mem_nbr order by b.tran_date rows between unbounded preceding and current row) as cumulative_bal
from mem a
inner join (
select
mem_nbr,
tran_date,
tran_amt,
'Tran' as tran_type
from tran
union all
select
mem_nbr,
adj_date,
adj_amt,
'Adj' as tran_type
from adj
) b
on a.mem_nbr = b.mem_nbr
where a.cur_bal < 0
qualify cumulative_bal < 0
) z
qualify rank() over (partition by mem_nbr order by tran_date) = 1
The subquery picks up all instances where the cumulative balance is negative, then the outer query picks up the earliest instance of it. If you want the latest, add desc after tran_date in the final qualify line.