Performing math on SELECT result rows - sql

I have a table that houses customer balances, and I need to see when an account's balance has dropped by a certain percentage from the previous month's balance.
My output consists of an account id, a year_month combination code, and the month-ending balance. So I want to see whether February's balance dropped by X% from January's, and whether January's dropped by the same percentage from December's. If it did drop, I would like to see which year_month code it dropped in. One account could have multiple drops, and I would want to see all of them.
Does anyone have any ideas on how to do this within SQL?
EDIT: Adding some sample data as requested. On the table I am looking at I have year_month as a column, but I do have access to get the last business day date per month as well
account_id | year_month | ending balance
1 | 2016-1 | 50000
1 | 2016-2 | 40000
1 | 2016-3 | 25
Output that I would like to see is the year_month code when the ending balance has at least a 50% decline from the previous month.

First, I would recommend converting Year_Month to a real date (yyyy-mm-dd) for this calculation. Then join the table to itself so that each row is matched to the same account's prior-month row, and perform the calculation in the SELECT. Something like the following (SQL Server syntax, assuming YearMonth holds the first day of each month):
SELECT x.*,
x.EndingBalance - y.EndingBalance AS BalanceChange
FROM Balances x
INNER JOIN Balances y ON x.AccountID = y.AccountID
-- y is the prior month: the first day of the month before x.YearMonth
AND y.YearMonth = DATEADD(month, DATEDIFF(month, 0, x.YearMonth) - 1, 0)
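To get from the month-over-month difference to the "at least a 50% decline" output the question asks for, the self-join can carry a WHERE condition on the ratio. The following is a minimal sketch, not the answerer's code: it runs the SQL through Python's built-in sqlite3 module, and the table name `balances`, the column `month_start` (the year_month code stored as a first-of-month date), and `DROP_THRESHOLD` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: year_month stored as the first day of the month.
conn.execute(
    "CREATE TABLE balances (account_id INTEGER, month_start TEXT, ending_balance REAL)"
)
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?)",
    [(1, "2016-01-01", 50000), (1, "2016-02-01", 40000), (1, "2016-03-01", 25)],
)

DROP_THRESHOLD = 0.5  # assumed: flag declines of at least 50%

drops = conn.execute(
    """
    SELECT cur.account_id,
           cur.month_start,
           prev.ending_balance AS prev_balance,
           cur.ending_balance  AS cur_balance
    FROM balances cur
    JOIN balances prev
      ON prev.account_id = cur.account_id
     AND prev.month_start = date(cur.month_start, '-1 month')  -- prior month
    WHERE prev.ending_balance > 0                               -- avoid meaningless ratios
      AND cur.ending_balance <= prev.ending_balance * (1 - ?)
    ORDER BY cur.account_id, cur.month_start
    """,
    (DROP_THRESHOLD,),
).fetchall()

print(drops)
```

With the sample data, only 2016-03 is returned: 40000 to 25 is a drop of more than 50%, while 50000 to 40000 is only 20%. An account with several qualifying months would get one row per month.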

Related

Retain values till there is a change in value in Teradata

There is a transaction history table in teradata where balance gets changed only when there is a transaction
Data as below:
Cust_id Balance Txn_dt
123 1000 27MAY2018
123 350 31MAY2018
For example, customer 123 has a balance of 1000 on May 27, and on May 31 the customer makes a transaction, so the balance becomes 350. No records are kept for May 28 to May 30, when the balance was the same as on May 27. I want rows for those days as well, with the balance retained and the date incremented. In other words, the same record should be repeated for each following day until a transaction changes the balance. How can I do this in Teradata?
Expected output:
Cust_id Balance Txn_dt
123 1000 27MAY2018
123 1000 28MAY2018
123 1000 29MAY2018
123 1000 30MAY2018
123 350 31MAY2018
Thanks
Sandy
Hi Dnoeth. It seems to work, but can you let me know how to expand up to a certain day, e.g. until 30JUN2018?
There are several ways to get this result. The simplest in Teradata uses Time Series Expansion for Periods:
WITH cte AS
(
SELECT Cust_id, Balance, Txn_dt,
-- return the next row's date
Coalesce(Min(Txn_dt)
Over (PARTITION BY Cust_id
ORDER BY Txn_dt
ROWS BETWEEN 1 Following AND 1 Following)
,Txn_dt+1) AS next_Txn_dt
FROM tab
)
SELECT Cust_id, Balance
,Last(pd) -- last day of the period
FROM cte
-- make a period of the current and next row's date
-- and return one row per day
EXPAND ON PERIOD(Txn_dt, next_Txn_dt) AS pd
If you run TD16.10+ you can replace the MIN OVER with a simplified LEAD:
Lead(Txn_dt)
Over (PARTITION BY Cust_id
ORDER BY Txn_dt)
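For readers without Teradata, and for the follow-up about expanding up to a fixed day, the same idea can be sketched with a recursive CTE instead of EXPAND ON. This is a swapped-in technique, not the answerer's method: it runs on SQLite through Python's sqlite3 module, dates are stored as ISO strings, and `END_DATE` is a hypothetical cut-off chosen to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (cust_id INTEGER, balance INTEGER, txn_dt TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [(123, 1000, "2018-05-27"), (123, 350, "2018-05-31")],
)

END_DATE = "2018-06-02"  # hypothetical: last day the final balance is carried to

filled = conn.execute(
    """
    WITH RECURSIVE spans AS (
        -- each row plus the date of the customer's next transaction;
        -- the last row runs until the day after the cut-off
        SELECT t.cust_id, t.balance, t.txn_dt,
               COALESCE(
                   (SELECT MIN(n.txn_dt) FROM tab n
                     WHERE n.cust_id = t.cust_id AND n.txn_dt > t.txn_dt),
                   date(:end_date, '+1 day')
               ) AS next_dt
        FROM tab t
    ),
    days AS (
        SELECT cust_id, balance, txn_dt AS dt, next_dt FROM spans
        UNION ALL
        -- repeat the balance one day at a time, stopping before next_dt
        SELECT cust_id, balance, date(dt, '+1 day'), next_dt
        FROM days
        WHERE date(dt, '+1 day') < next_dt
    )
    SELECT cust_id, balance, dt FROM days ORDER BY cust_id, dt
    """,
    {"end_date": END_DATE},
).fetchall()

for row in filled:
    print(row)
```

In the Teradata query above, the analogous change would be to replace the `Txn_dt+1` fallback in the COALESCE with the day after the desired end date, since the end of a period is exclusive.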

Table Self Join for Year over Year comparison

I am having trouble self joining a table of data to provide year over year results in a single row.
My Data is currently stored as follows in table sales. I can work with either Postgres or sqlite3.
we(date) | store | category | planu | planrev | merchu | merchrev
My desired outcome is:
I need to be able to show values for LY for 1/7/17 in the last 4 columns.
I will then union the results with those for all other partners replicating the same query for other tables.
It can be assumed that all current year data will be we>1/1/18 to match the date 364 days ago for previous year results.
Through reading other posts I think I may need to craft a CTE query, I just don't know where to start.
I hope this was clear.
Any help in working this out would be greatly appreciated.
It looks like you want to join on identical store and category as well as same month and day of the year. That would look as follows in PostgreSQL:
select
'PartnerA' as channel,
cy.we as date,
cy.month,
cy.year,
cy.store,
'PartnerA ' || cy.store as ch_store,
cy.category,
cy.planu,
cy.planrev,
cy.merchu,
cy.merchrev,
ly.planu as planu_ly,
ly.planrev as planrev_ly,
ly.merchu as merchu_ly,
ly.merchrev as merchrev_ly
from sales cy
join sales ly on cy.store = ly.store and cy.category = ly.category
and cy.we - interval '1 year' = ly.we
;
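The question asks for last year's row from 364 days earlier (so the weekday lines up), whereas the query above matches on the same calendar date one year back. A minimal sketch of the 364-day variant follows, assuming SQLite through Python's sqlite3 module, a trimmed-down `sales` table with only `planu` and `planrev`, and invented sample rows. A LEFT JOIN is used so that current-year rows without a last-year match still appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed, hypothetical version of the sales table from the question.
conn.execute(
    "CREATE TABLE sales (we TEXT, store TEXT, category TEXT, planu INTEGER, planrev REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        ("2017-01-07", "S1", "Toys", 10, 100.0),
        ("2018-01-06", "S1", "Toys", 12, 130.0),
        ("2018-01-06", "S2", "Toys", 5, 40.0),  # no last-year row for S2
    ],
)

yoy = conn.execute(
    """
    SELECT cy.we, cy.store, cy.category, cy.planu, cy.planrev,
           ly.planu   AS planu_ly,
           ly.planrev AS planrev_ly
    FROM sales cy
    LEFT JOIN sales ly
      ON ly.store = cy.store
     AND ly.category = cy.category
     AND ly.we = date(cy.we, '-364 days')   -- same weekday, one year back
    WHERE cy.we > '2018-01-01'              -- current-year rows only
    ORDER BY cy.store
    """
).fetchall()

for row in yoy:
    print(row)
```

In PostgreSQL, where `we` is a real date column, the join condition can be written as `ly.we = cy.we - 364`, because subtracting an integer from a date yields a date.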

SQL - How can I sum up a column after the results have been grouped and filtered in the having clause?

Here is my current query: The objective is to find accounts that have received at least $500 in deposits within 30 days of their first deposit. Some accounts have been closed and re-opened, hence the first line of the 'WHERE' clause.
select Deposits.accountNumber,
min(Deposits.transDate) as "first deposit",
Deposits.transDate,
CAST(DATEADD(d,30,min(Deposits.transDate)) as date) as "30 days",
sum(Deposits.amount) as "sum",
Deposits.amount,
Members.accountOpenDate
from Deposits
inner join Members on Deposits.accountNumber = members.accountNumber
where Deposits.transDate >= members.accountOpenDate
and Deposits.accountNumber = 123456
group by Deposits.accountNumber
having Deposits.transDate between min(Deposits.transDate) and DATEADD('d',30,min(Deposits.transDate))
and sum(Deposits.amount) >= 500
The problem I am running into, is that the last line of the HAVING statement:
and sum(Deposits.amount) >= 500
is including all of the transactions for the account, as if there was no 'HAVING' clause. It is factoring in transactions that are excluded from the first line of the 'HAVING':
having Deposits.transDate between min(Deposits.transDate) and DATEADD('d',30,min(Deposits.transDate))
Here is what my data looks like (without grouping by account number):
accountNumber amount sum
123456 $100 $6,500
123456 $50 $6,500
123456 $50 $6,500
And here is what I am trying to get to:
accountNumber amount sum
123456 $100 $200
123456 $50 $200
123456 $50 $200
Thanks in advance. My DBMS is InterSystems Caché.
You can try something like this:
select filtered.accountNumber,
min(filtered.transDate) as "first deposit",
filtered.transDate,
CAST(DATEADD(d,30,min(filtered.transDate)) as date) as "30 days",
sum(filtered.amount) as "sum",
filtered.amount,
filtered.accountOpenDate
from
(
select * from Deposits
inner join Members on Deposits.accountNumber = members.accountNumber
where Deposits.transDate >= members.accountOpenDate
and Deposits.accountNumber = 123456
having Deposits.transDate between min(Deposits.transDate) and DATEADD('d',30,min(Deposits.transDate))
) as filtered
group by filtered.accountNumber
having sum(filtered.amount) >= 500
With a query like this, you first filter the data by applying the transDate condition, and then you can apply the filter on the sum of the amounts.
We need clarification:
1. Are the 3 transactions you show all within the 30 day window? If yes, then the total is less than $500. So, this account should be skipped.
2. Since $6500 is the total of all trans greater than the open date, why even calculate it? You only care about the 30 day window.
Besides that, I think the disconnect is the date calculation in the HAVING clause. You use MIN in the SELECT, but use a totally different aggregate date calculation in the HAVING. I think you should take the calculation out of the HAVING and make it part of the WHERE.
Of course, once you do that, you'll have to take the MIN out of the SELECT.
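The suggestion to move the date calculation out of the HAVING clause can be sketched as follows: compute each account's first deposit in a derived table, restrict rows to the 30-day window in the WHERE clause, and only then aggregate. This is a hedged illustration on SQLite through Python's sqlite3 module, not Caché syntax; the snake_case table and column names and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members  (account_number INTEGER, account_open_date TEXT);
    CREATE TABLE deposits (account_number INTEGER, trans_date TEXT, amount REAL);
    INSERT INTO members VALUES (1, '2020-01-01'), (2, '2020-01-01');
    INSERT INTO deposits VALUES
        (1, '2019-06-01',  900),  -- before the account was re-opened: ignored
        (1, '2020-01-01',  300),
        (1, '2020-01-15',  250),
        (1, '2020-03-01', 6000),  -- outside the 30-day window
        (2, '2020-01-01',  100),
        (2, '2020-01-10',   50),
        (2, '2020-01-20',   50),
        (2, '2020-04-01', 6300);  -- outside the 30-day window
    """
)

qualifying = conn.execute(
    """
    WITH eligible AS (
        -- deposits made on or after the (re-)open date
        SELECT d.account_number, d.trans_date, d.amount
        FROM deposits d
        JOIN members m ON m.account_number = d.account_number
        WHERE d.trans_date >= m.account_open_date
    ),
    first_dep AS (
        SELECT account_number, MIN(trans_date) AS first_date
        FROM eligible
        GROUP BY account_number
    )
    SELECT e.account_number, f.first_date, SUM(e.amount) AS total_30d
    FROM eligible e
    JOIN first_dep f ON f.account_number = e.account_number
    WHERE e.trans_date <= date(f.first_date, '+30 days')  -- row filter before grouping
    GROUP BY e.account_number, f.first_date
    HAVING SUM(e.amount) >= 500                           -- group filter after summing
    ORDER BY e.account_number
    """
).fetchall()

print(qualifying)
```

Account 1 qualifies with 550 inside the window; account 2 totals only 200 there and is skipped, even though its overall deposits are much larger.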

How to count the number of active days in a dataset with SQL Server 2008

SQL Server 2008, rendered in html via aspx webpage.
What I want to achieve, is to get an average per day figure that makes allowance for missing days. To do this I need to count the number of active days in a table.
Example:
Date | Amount
---------------------
2014-08-16 | 234.56
2014-08-16 | 258.30
2014-08-18 | 25.84
2014-08-19 | 259.21
The sum of the lot (777.91) divided by the number of active days (3) would equal 259.30
So it needs to go "count number of different dates in the returned range"
Is there a tidy way to do this?
If you just want that one row of output then this should work:
select sum(amount) / count(distinct date) as your_average
from your_table
Fiddle:
http://sqlfiddle.com/#!2/7ffd1/1/0
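As a quick check of the arithmetic, here is the same query run against the sample rows from the question. It is a minimal sketch using SQLite through Python's sqlite3 module; the column is named `txn_date` here only to avoid using `date` as an identifier, which is an assumption and not part of the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (txn_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        ("2014-08-16", 234.56),
        ("2014-08-16", 258.30),
        ("2014-08-18", 25.84),
        ("2014-08-19", 259.21),
    ],
)

total, active_days, avg_per_day = conn.execute(
    """
    SELECT SUM(amount),
           COUNT(DISTINCT txn_date),               -- days that have at least one row
           SUM(amount) / COUNT(DISTINCT txn_date)
    FROM your_table
    """
).fetchone()

print(round(total, 2), active_days, round(avg_per_day, 2))
```

The two rows on 2014-08-16 count as one active day, so the divisor is 3. If the amount column were an integer type in SQL Server, the division would be integer division, so multiplying the sum by 1.0 first would be needed to keep the decimals.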
I don't know if this will help you, but how about using GROUP BY with the AVG and COUNT functions?
SELECT Date, AVG(Amount) AS 'AmountAverage', COUNT(*) AS 'NumberOfActiveDays'
FROM YourTable WITH(NOLOCK)
GROUP BY Date

oracle sql: efficient way to calculate business days in a month

I have a pretty huge table with columns dates, account, amount, etc. eg.
date account amount
4/1/2014 XXXXX1 80
4/1/2014 XXXXX1 20
4/2/2014 XXXXX1 840
4/3/2014 XXXXX1 120
4/1/2014 XXXXX2 130
4/3/2014 XXXXX2 300
...........
(I have 40 months' worth of daily data and multiple accounts.)
The final output I want is the average amount of each account each month. Since there may or may not be a record for a given account on any single day, and I have a separate table of holidays from 2011~2014, I am summing up the amount of each account within a month and dividing it by the number of business days in that month. Note that there are very likely to be records on weekends/holidays, so I need to exclude them from the calculation. Also, I want to have a record for each date available in the original table, e.g.
date account amount
4/1/2014 XXXXX1 48 ((80+20+840+120)/22)
4/2/2014 XXXXX1 48
4/3/2014 XXXXX1 48
4/1/2014 XXXXX2 19 ((130+300)/22)
4/3/2014 XXXXX2 19
...........
(Suppose the above is the only data I have for Apr-2014.)
I am able to do this in a hacky and slow way, but as I need to join this process with other subqueries, I really need to optimize this query. My current code looks like:
select
date,
account,
sum(amount/days_mon) over (partition by last_day(date))
from(
select
date,
-- there are more calculation to get the account numbers,
-- so this subquery is necessary
account,
amount,
-- this is a list of month-end dates that the number of
-- business days in that month is 19. similar below.
case when last_day(date) in ('','',...,'') then 19
when last_day(date) in ('','',...,'') then 20
when last_day(date) in ('','',...,'') then 21
when last_day(date) in ('','',...,'') then 22
when last_day(date) in ('','',...,'') then 23
end as days_mon
from mytable tb
inner join lookup_businessday_list busi
on tb.date = busi.date)
So how can I perform the above purpose efficiently? Thank you!
This approach uses sub-query factoring, which other RDBMS flavours call common table expressions. The attraction here is that we can pass the output from one CTE as input to another.
The first CTE generates a list of dates in a given month (you can extend this over any range you like).
The second CTE uses an anti-join on the first to filter out dates which are holidays and also dates which aren't weekdays. Note that the day number varies according to the NLS_TERRITORY setting; in my realm the weekend is days 6 and 7, but SQL Fiddle is American so there it is 1 and 7.
with dates as ( select date '2014-04-01' + ( level - 1) as d
from dual
connect by level <= 30 )
, bdays as ( select d
, count(d) over () tot_d
from dates
left join holidays
on dates.d = holidays.hol_date
where holidays.hol_date is null
and to_number(to_char(dates.d, 'D')) between 2 and 6
)
select yt.account
, yt.txn_date
, sum(yt.amount) over (partition by yt.account, trunc(yt.txn_date,'MM'))
/tot_d as avg_amt
from your_table yt
join bdays
on bdays.d = yt.txn_date
order by yt.account
, yt.txn_date
/
I haven't rounded the average amount.
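The same two-step idea (generate the month's dates, drop weekends and holidays, then divide each account's total by the remaining day count) can be tried outside Oracle. The sketch below is an illustration on SQLite through Python's sqlite3 module, not a translation the answerer provided. The table names `txns` and `holidays` and the sample holiday are assumptions, and the month is hard-coded to April 2014. SQLite's `strftime('%w', ...)` always returns 0 for Sunday and 6 for Saturday, so no territory setting is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE txns (txn_date TEXT, account TEXT, amount REAL);
    CREATE TABLE holidays (hol_date TEXT);
    INSERT INTO txns VALUES
        ('2014-04-01', 'XXXXX1',  80),
        ('2014-04-01', 'XXXXX1',  20),
        ('2014-04-02', 'XXXXX1', 840),
        ('2014-04-03', 'XXXXX1', 120),
        ('2014-04-05', 'XXXXX1', 999),  -- Saturday: must be excluded
        ('2014-04-01', 'XXXXX2', 130),
        ('2014-04-03', 'XXXXX2', 300);
    INSERT INTO holidays VALUES ('2014-05-26');  -- hypothetical entry, not in April
    """
)

monthly_avg = conn.execute(
    """
    WITH RECURSIVE dates(d) AS (
        SELECT '2014-04-01'
        UNION ALL
        SELECT date(d, '+1 day') FROM dates WHERE d < '2014-04-30'
    ),
    bdays AS (
        SELECT d FROM dates
        WHERE strftime('%w', d) NOT IN ('0', '6')        -- drop Sunday and Saturday
          AND d NOT IN (SELECT hol_date FROM holidays)   -- drop holidays
    )
    SELECT t.account,
           (SELECT COUNT(*) FROM bdays) AS n_bdays,
           ROUND(SUM(t.amount) / (SELECT COUNT(*) FROM bdays), 2) AS avg_amt
    FROM txns t
    JOIN bdays b ON b.d = t.txn_date     -- keeps business-day transactions only
    GROUP BY t.account
    ORDER BY t.account
    """
).fetchall()

for row in monthly_avg:
    print(row)
```

April 2014 has 22 business days when no holiday falls in it, which matches the divisor used in the question's expected output.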
You have 40 months of data, and most of that data should be very stable.
I will assume that you have a cold body (a big, stable, easily definable range of data) and a hot tail (a small and active part).
Next, I would like to define a minimal period: the smallest interval of data that is interesting for the business.
It might be a year, month, day, hour, etc. Do you expect to get questions like "what was the average for that account between 1900 and 12am yesterday?"
I will assume that the answer is DAY.
Then,
I will calculate sum(amount) and count() for every account for every DAY of the cold body.
I will not create dummy records if a particular account had no activity on some day.
I will save day, account, total amount, and count in a TABLE.
If the cold body is modified later, you delete and reload the affected day in that table.
For the hot tail there are multiple possible strategies:
1. Do the same as above (same process, clear to support).
2. Always calculate on the fly.
3. Use a materialized view as a middle ground between 1 and 2.
The cold body table (call it totalc) could also be implemented as a materialized view, but if the data never changes there is no need to rebuild it.
With this you go from (number of accounts) x (number of transactions per day) x (number of days) records to (number of accounts) x (number of active days) records.
That should speed up all subsequent calculations.
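The pre-aggregation described above might look roughly like this. It is a sketch under stated assumptions: SQLite through Python's sqlite3 module stands in for Oracle, and the names `txns`, `daily_totals`, and `reload_day` are hypothetical. The summary table holds one row per account per active day, and a late change to a day is handled by deleting and reloading just that day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE txns (txn_date TEXT, account TEXT, amount REAL);
    INSERT INTO txns VALUES
        ('2014-04-01', 'XXXXX1',  80),
        ('2014-04-01', 'XXXXX1',  20),
        ('2014-04-02', 'XXXXX1', 840),
        ('2014-04-03', 'XXXXX1', 120),
        ('2014-04-01', 'XXXXX2', 130),
        ('2014-04-03', 'XXXXX2', 300);

    -- one row per account per active day; no dummy rows for inactive days
    CREATE TABLE daily_totals AS
    SELECT txn_date, account, SUM(amount) AS total_amount, COUNT(*) AS txn_count
    FROM txns
    GROUP BY txn_date, account;
    """
)


def reload_day(conn, day):
    """Delete and rebuild the summary rows for a single day."""
    with conn:  # commit both statements together, or roll back on error
        conn.execute("DELETE FROM daily_totals WHERE txn_date = ?", (day,))
        conn.execute(
            """
            INSERT INTO daily_totals
            SELECT txn_date, account, SUM(amount), COUNT(*)
            FROM txns
            WHERE txn_date = ?
            GROUP BY txn_date, account
            """,
            (day,),
        )


# A late correction arrives for 2014-04-02, so only that day is reloaded.
conn.execute("INSERT INTO txns VALUES ('2014-04-02', 'XXXXX1', 60)")
reload_day(conn, "2014-04-02")

summary_rows = conn.execute("SELECT COUNT(*) FROM daily_totals").fetchone()[0]
corrected = conn.execute(
    "SELECT total_amount, txn_count FROM daily_totals "
    "WHERE txn_date = '2014-04-02' AND account = 'XXXXX1'"
).fetchone()
print(summary_rows, corrected)
```

Monthly averages can then be computed from `daily_totals` instead of the raw transactions, which is where the reduction in row count pays off.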