Forecast based on monthly figures - sql

I receive a monthly list of unpresented cheques with their payment dates. Those that have reached a certain age in Month 1 (say, 90 days to date) are recognized as overdue and counted as X(1).
Those older than 60 days, Y(1), are near-overdue and will appear in the Month 2 statement as X(2) if not banked by then.
Some of the cheques are pretty old and have sat in the system for ages (over 1000 days), so they appear in every monthly X statement, while a certain share of the cheques in both X(n) and Y(n) will disappear from next month's X(n+1).
What would be the best logic for a next-month forecast based on actual historical data? The most important figure is X, but Y is also welcome. It has to be a forecast, as there is no data for next month yet.
The data is in SQL, if that is relevant, but I mostly need to understand the logic; then I can write the code myself.

The logic below gets you the Number of Checks, Percent Overdue, Percent Near Overdue, and Percent All Other Checks counts for the current time period. Next step would be to create a query that does this for 3 or 12 months (or however many months you want to use to get projections). Then you trend out your total number of checks, and trend the percentages for each of the subcategories. You can use that to predict future amounts.
SELECT
    COUNT(*) AS [Number of Checks],
    -- Multiply by 1.0 so the division is not done in integers
    SUM(CASE WHEN s.ageofcheck >= 90 THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS [Percent Overdue Checks],
    SUM(CASE WHEN s.ageofcheck >= 60 AND s.ageofcheck < 90 THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS [Percent Near Overdue Checks],
    SUM(CASE WHEN s.ageofcheck < 60 THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS [Percent All Other Checks]
FROM
(
    SELECT
        c.checknumber,
        -- Age in days: start date first, end date second
        DATEDIFF(dd, c.checkdate, GETDATE()) AS ageofcheck
    FROM
        checks_table AS c
) AS s
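To make the "trend the totals and the percentages" step concrete, here is a minimal Python sketch of the logic. It assumes you have already collected one (total checks, overdue checks) pair per month; the function names, the choice of a straight-line least-squares trend, and the sample figures are all illustrative assumptions, not something from the original data.

```python
from statistics import mean

def linear_forecast(values):
    """Fit y = a + b*t by least squares over t = 0..n-1 and return the value at t = n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two months of history")
    ts = list(range(n))
    t_mean = mean(ts)
    y_mean = mean(values)
    slope = (
        sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, values))
        / sum((t - t_mean) ** 2 for t in ts)
    )
    intercept = y_mean - slope * t_mean
    return intercept + slope * n

def forecast_overdue(history):
    """history: list of (total_checks, overdue_checks) per month, oldest first.

    Trends the total count and the overdue share separately, then multiplies
    the two projections to estimate next month's overdue count X.
    """
    totals = [total for total, _ in history]
    shares = [overdue / total for total, overdue in history]
    next_total = linear_forecast(totals)
    # A share cannot leave the range 0..1, so clamp the projection.
    next_share = min(max(linear_forecast(shares), 0.0), 1.0)
    return next_total * next_share

# Made-up figures purely for illustration.
history = [(1000, 100), (1100, 121), (1200, 144)]
print(round(forecast_overdue(history), 1))
```

The same function can be reused for Y by passing the near-overdue counts instead. A straight line is only one option; with a long history a moving average of the shares may be more robust.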

Related

How to find first day of multi-day event on monthly basis?

I have a table with daily snapshot data that has a date and amount column. I'm converting this to a monthly snapshot table. Every record in this monthly table should look like what you see in the table below at 31-01.
If on the last day of the month the amount is negative, amount_negative_lastday_of_month = True. The part I can't figure out is the first_date_negative_amount. If the amount_negative_lastday_of_month = True, I want to get the first day on which the amount became negative for that specific period where the amount was negative. In the table below that would be 28-01-2021.
If this period of amount < 0 continues until the end of February, we will see exactly the same in the February monthly snapshot, where amount_negative_lastday_of_month = True and first_date_negative_amount = 28-01-2021.
If this period of negative amount is over, and in a few months a new period of negative amount begins, I want to do the same again.
The problem is that I can't figure out how to group/window my data by varying periods of time where a certain condition is met (amount < 0).
How can I query this?
date       amount  last_day_of_month  amount_negative_lastday  first_date_negative_amount
26-1-2021  -10
27-1-2021  200
28-1-2021  -200
29-1-2021  -200
30-1-2021  -200
31-1-2021  -100    31-1-2021          TRUE                     28-1-2021
1-2-2021   -122
2-2-2021   -222
3-2-2021   -322
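The grouping described above is a run-tracking problem: remember the first day of the current unbroken run of negative amounts, reset it whenever the amount is not negative, and read it off on the last day of each month. The following is a minimal Python sketch of that logic only, assuming one row per calendar day sorted by date; the function name and output keys are hypothetical and simply mirror the column names in the question.

```python
from datetime import date, timedelta

def monthly_snapshot(daily):
    """daily: list of (date, amount), sorted by date, one row per day.

    Returns one dict per month-end row found in the data.
    """
    snapshots = []
    streak_start = None  # first day of the current run of negative amounts
    for day, amount in daily:
        if amount < 0:
            if streak_start is None:
                streak_start = day  # a new negative run begins here
        else:
            streak_start = None  # any non-negative amount ends the run
        # The day is a month end if tomorrow falls in a different month.
        is_last_day = (day + timedelta(days=1)).month != day.month
        if is_last_day:
            snapshots.append({
                "last_day_of_month": day,
                "amount_negative_lastday": amount < 0,
                "first_date_negative_amount": streak_start,
            })
    return snapshots

# Rows taken from the table in the question.
daily = [
    (date(2021, 1, 26), -10), (date(2021, 1, 27), 200),
    (date(2021, 1, 28), -200), (date(2021, 1, 29), -200),
    (date(2021, 1, 30), -200), (date(2021, 1, 31), -100),
    (date(2021, 2, 1), -122), (date(2021, 2, 2), -222),
    (date(2021, 2, 3), -322),
]
for row in monthly_snapshot(daily):
    print(row)
```

Because the run start is carried across month boundaries, a run that lasts through February would report the same 28-1-2021 start date in the February snapshot.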

Teradata create date range window based on previous rows

Teradata DB
I am having a rough go at it. I have a dataset and I want to create a customer journey. The rules are that the first transaction is a journey. The next transaction that is at least 30 days out is a journey. The next transaction at least 30 days past that is also a journey. I do not have access to programming, only regular queries.
There are a few scenarios.
Customer has just 1 transaction in the dataset. Since it is the only one, it is flagged as a journey.
Customer has 2 transactions within 5 days. The first one is a journey and the second one is not since it is within 30 days.
Customer has 2 transactions. 1/1 and 2/5. They are > 30 days apart so each is flagged as a journey.
Customer has 3 transactions. 1/1, 1/8, 2/5. The first and third are journeys and the second one is not (since it is within the 30 day window of a previously flagged journey).
I have tried everything, but there always seems to be some scenario that doesn't work.
I have the logic written down, but I can't figure out how to do it in Teradata.
If trans_idx = 1 then journey_flag = Y
If date - previous trans_idx date > 30 then journey_flag = Y
This is the part I can get right. I can't get the SQL right for the following logic. If date - previous trans_idx date < 30, then I need to accumulate the difference and add the next row's difference to it. If the total is still < 30, I need to keep accumulating with the next row. Once it gets past 30, I need to set that row's journey flag to Y.
The following works, but it only compares against the previous row. If I change it to unbounded, it will look at all the rows for the given sequence. I just need it to go back to the end of the previous 30-day window.
CASE
    WHEN RUNNING_SUM_FLOAT = 0 THEN 'Y'
    WHEN RUNNING_SUM_FLOAT - MIN(RUNNING_SUM_FLOAT)
        OVER (PARTITION BY sequence_id ORDER BY trans_idx
              ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) >= 30
    THEN 'Y'
    ELSE 'N'
END AS journey_flag
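The rule being described compares each transaction with the most recently flagged journey, not with the previous row, which is why a fixed window frame keeps missing cases. Here is a minimal Python sketch of that logic for a single customer, assuming "at least 30 days" means a gap of 30 days or more (matching the >= 30 in the snippet above); the function name and the year in the sample dates are made up for illustration. It only illustrates the rule; in Teradata itself this kind of carried-forward state would probably need something like a recursive query.

```python
from datetime import date

def flag_journeys(dates, window_days=30):
    """Return [(date, 'Y' or 'N')] for one customer's transaction dates.

    A transaction is a journey if it is the first one, or if it falls at
    least `window_days` after the last transaction flagged as a journey.
    """
    flags = []
    anchor = None  # date of the most recent journey
    for d in sorted(dates):
        if anchor is None or (d - anchor).days >= window_days:
            anchor = d  # this transaction starts a new 30-day window
            flags.append((d, "Y"))
        else:
            flags.append((d, "N"))
    return flags

# Scenario from the question: 1/1, 1/8, 2/5 (year chosen arbitrarily).
for d, flag in flag_journeys([date(2021, 1, 1), date(2021, 1, 8), date(2021, 2, 5)]):
    print(d, flag)
```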

MDX - Average over the whole time period, even when no data exists

I have a fact table with 1 row for each day. Example:
ID Date Value
1 20190101 10
1 20190102 15
2 20190101 31
If I take a simple Value average in SSAS cube I get:
ID Average <Formula>
1 12.5 (10+15)/2
2 15.5 31/2
As I understand, 15.5 is there because in total there are 2 days in the scope as only two days exist in the fact data when I select the whole month.
However, I need to calculate a monthly average instead. It should check that there are 31 days in that month (based on Date dimension) and get this result:
ID Average <Formula>
1 0.8 (10+15)/31
2 1 31/31
So far I've tried to create some "fake rows" in my data; for example, I've tried to create rows with Value = 0 for dates 20190103-20190131 for ID=1.
This works, it forces the calculation for ID=1 to always take all days in the period, but it messes up my other calculations in the cube.
Any other ways to force average calculation in SSAS multidimensional cube to always calculate for the entire month?
If you want to do the calculation in the cube, you can use the Descendants function on your Date dimension.
For example, the following gives the number of days in each month using the AdventureWorks sample:
WITH MEMBER Measures.DayCount AS
Descendants
(
[Date].[Calendar].CurrentMember,
[Date].[Calendar].[Date],
LEAVES
).Count
SELECT [Measures].[DayCount] ON 0,
[Date].[Calendar].[Month].ALLMEMBERS ON 1
FROM [Adventure Works]
I would recommend:
select id, eomonth(date) as eom,
sum(value) * 1.0 / day(eomonth(date)) as average
from t
group by id, eomonth(date);
EOMONTH() returns the last day of the month. You can extract the day to get the number of days in the month.
The * 1.0 is there because SQL Server does integer division. Your numbers look like integers, but if you are getting 15.5, then you actually have numerics or something other than an integer.
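As a cross-check of the arithmetic, here is a minimal Python sketch of the same idea: sum the values per ID and month, then divide by the number of calendar days in that month rather than by the number of fact rows. The function name is hypothetical, and the rows are the three sample rows from the question.

```python
import calendar
from collections import defaultdict
from datetime import date

def monthly_average(rows):
    """rows: iterable of (id, date, value). Returns {(id, year, month): average}."""
    totals = defaultdict(float)
    for item_id, day, value in rows:
        totals[(item_id, day.year, day.month)] += value
    return {
        # monthrange returns (weekday of the 1st, number of days in the month)
        key: total / calendar.monthrange(key[1], key[2])[1]
        for key, total in totals.items()
    }

rows = [
    (1, date(2019, 1, 1), 10),
    (1, date(2019, 1, 2), 15),
    (2, date(2019, 1, 1), 31),
]
for key, avg in sorted(monthly_average(rows).items()):
    print(key, round(avg, 1))
```

For the sample rows this gives 25/31 for ID 1 and 31/31 for ID 2, in line with the expected results in the question.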

SQLite - Determine average sales made for each day of week

I am trying to produce a query in SQLite where I can determine the average sales made each weekday in the year.
As an example, I'd like to be able to say
"The average sales for Monday are $400.50 in 2017"
I have a sales table - each row represents a sale you made. You can have multiple sales for the same day. Columns that would be of interest here:
Id, SalesTotal, DayCreated, MonthCreated, YearCreated, CreationDate, PeriodOfTheDay
DayCreated, MonthCreated, and YearCreated are integers that represent the day, month, and year of the sale. CreationDate is a Unix timestamp that represents the date and time the sale was created (and obviously agrees with the day/month/year columns).
PeriodOfTheDay is 0, or 1 (day, or night). You can have multiple records for a given day (typically you can have at most 2 but some people like to add all of their sales in individually, so you could have 5 or more for a day).
Where I am stuck
Because you can have two records on the same day (i.e. a day sales, and a night sales, or multiple of each) I can't just group by day of the week (i.e. group all records by Saturday).
This is because the number of sales you made does not equal the number of days you worked. For example, I could have worked 10 Saturdays but had 30 sales, so grouping by 'Saturday' would cover 30 sales, since 30 records exist for Saturday (some just happen to share the same day).
Furthermore, if I group by daycreated,monthcreated,yearcreated it works in the sense it produces x rows (where x is the number of days you worked) however that now means I need to return this resultset to the back end and do a row count. I'd rather do this in the query so I can take the sales and divide it by the number of days you worked.
Would anyone be able to assist?
Thanks!
UPDATE
I think I got it - I would love someone to tell me if I'm right:
SELECT COUNT(DISTINCT CAST(( julianday((datetime(CreationDate / 1000, 'unixepoch', 'localtime'))) ) / 7 AS INT))
FROM Sales
WHERE strftime('%w', datetime(CreationDate / 1000, 'unixepoch'), 'localtime') = '6'
AND YearCreated = 2017
This would produce the number for saturday, and then I'd just put this in as an inner query, dividing the sale total by this number of days.
Buddy,
You can group your query by the day of the week and the week number, derived from the day created or the creation date.
In SQL Server (MSSQL):
DATEPART(WEEK,'2017-08-14') -- Will give you week 33
DATEPART(WEEKDAY,'2017-08-14') -- Will give you day 2
In MySQL:
WEEK('2017-08-14') -- Will give you week 33
DAYOFWEEK('2017-08-14') -- Will give you day 2
See these figures:
Day of Week
1-Sunday, 2-Monday, 3-Tuesday, 4-Wednesday, 5-Thursday, 6-Friday, 7-Saturday
Week Number
1 - 53 weeks in a year
Together these form the key, so that each Saturday of the year ends up in its own group.
Hope this helps in building your query.
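For SQLite specifically, one way to sketch the "sales total divided by days worked" idea is to divide SUM(SalesTotal) by a count of distinct calendar dates per weekday. The example below runs the query through Python's sqlite3 module against an in-memory table. It assumes CreationDate is stored in milliseconds (as the / 1000 in the question suggests) and works in UTC rather than local time; the helper names and the sample rows are made up for illustration.

```python
import sqlite3
from datetime import datetime, timezone

def to_millis(year, month, day, hour):
    """Unix timestamp in milliseconds for a UTC date and hour."""
    return int(datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp() * 1000)

def average_sales_by_weekday(conn, year):
    """Return [(dow, average sales per worked day)]; dow '0' = Sunday ... '6' = Saturday."""
    query = """
        SELECT strftime('%w', CreationDate / 1000, 'unixepoch') AS dow,
               SUM(SalesTotal)
                   / COUNT(DISTINCT date(CreationDate / 1000, 'unixepoch')) AS avg_per_day
        FROM Sales
        WHERE strftime('%Y', CreationDate / 1000, 'unixepoch') = ?
        GROUP BY dow
        ORDER BY dow
    """
    return conn.execute(query, (str(year),)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Id INTEGER PRIMARY KEY, SalesTotal REAL, CreationDate INTEGER)")
conn.executemany(
    "INSERT INTO Sales (SalesTotal, CreationDate) VALUES (?, ?)",
    [
        (100, to_millis(2017, 8, 5, 10)),   # Saturday, day sale
        (200, to_millis(2017, 8, 5, 20)),   # same Saturday, night sale
        (300, to_millis(2017, 8, 12, 10)),  # the following Saturday
        (50, to_millis(2017, 8, 14, 10)),   # Monday
    ],
)
for dow, avg in average_sales_by_weekday(conn, 2017):
    print(dow, avg)
```

Here the three Saturday records fall on two distinct dates, so the Saturday total of 600 is divided by 2 rather than by 3.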

SQL: Calculate the number of Days in a Month with no stock

I am trying to create a query that can calculate the number of days, in a given month, that a particular stock item was unavailable (i.e. quantity = 0).
Currently, I have developed a query that can calculate the number of days it has been from today's date where stock has been unavailable but what I am trying to actually calculate is, during a month, how many days was stock quantity = 0. ie: Month of Jan - on Jan 5, Jan 7 and Jan 20 there was no stock for Item A - this means that the number of days out of stock was = 3.
Extra Details:
Currently, my query looks at the stock level at the last transaction: if the stock quantity was 0 at that point, it calculates the number of days between the transaction date and today.
Select [StockItems].StockCode,
Case When SUM([StockItems].Qty_On_Hand)=0 Then Datediff(day, MAX([Transactions].TransactionDate), GETDATE()) ELSE 0 END AS 'Days Out of Stock'
From dbo.[Transactions]
INNER JOIN [StockItems]
ON [Transactions].[AccountLink] = [StockItems].[StockLink]
Where [StockItems].StockCode LIKE '%XXX%'
AND [Transactions].TransactionDate > '20141031'
Group By [StockItems].StockCode
My Thoughts
There are different sorts of transactions, one of which is a goods received transaction. Perhaps it is possible to find the dates on which a transaction left the stock quantity at zero, and then count the days from each such date until goods were received.
Thoughts?
Thank You.
SELECT COUNT([StockItems].Qty_On_Hand)
From dbo.[Transactions]
INNER JOIN [StockItems] ON [Transactions].[AccountLink] = [StockItems].[StockLink]
WHERE [StockItems].Qty_On_Hand = 0
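To count out-of-stock days within a month rather than rows, the stock level has to be carried forward from one transaction to the next. Below is a minimal Python sketch of that logic for one stock item. It assumes you can derive, per transaction date, the quantity on hand after that day's last transaction, plus the quantity at the start of the month; the function name, the input shape, and the sample quantities are all hypothetical.

```python
import calendar
from datetime import date

def days_out_of_stock(levels, year, month, opening_qty):
    """Count the days in a month on which the end-of-day quantity was 0.

    levels: {date: quantity on hand after that day's last transaction}
    opening_qty: quantity on hand just before the first day of the month
    """
    days_in_month = calendar.monthrange(year, month)[1]
    qty = opening_qty
    out_days = 0
    for day_number in range(1, days_in_month + 1):
        day = date(year, month, day_number)
        # With no transaction on this day, the previous level carries forward.
        qty = levels.get(day, qty)
        if qty == 0:
            out_days += 1
    return out_days

# Illustrative data: stock hits zero on Jan 5, 7 and 20 and is
# replenished by a goods received transaction the next day each time.
levels = {
    date(2015, 1, 5): 0, date(2015, 1, 6): 20,
    date(2015, 1, 7): 0, date(2015, 1, 8): 15,
    date(2015, 1, 20): 0, date(2015, 1, 21): 10,
}
print(days_out_of_stock(levels, 2015, 1, opening_qty=5))
```

If goods are not received the next day, every day until the next non-zero level is counted, which is the "count from that date until goods were received" idea from the question.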