Period over period SQL script? - sql

I have a dataset of 2 columns: 'Date' and 'Total Sales'. My dates are 01-01-2021, 02-01-2021, and so on up until 12-01-2022. I basically want to add another column, "previous month", that gives me the total sales for the previous month in the same row as the current month (else null). E.g. say I have 2 rows with dates 01-01-2021 and 02-01-2021, and total sales of $10 and $20 respectively. How can I create a column that would show the following:
Date       | Sales | Previous Month Sales
---------------------------------------------
01-01-2021 | $10   | null
02-01-2021 | $20   | $10
So on and so forth; this is my query:
CASE
WHEN `Date` > DATE_SUB(`Date`, INTERVAL 1 MONTH)
THEN `Monthly Sales`
ELSE 'null'
END
Thanks in advance

Well, Domo's back end runs on a MySQL engine (from what I recall from the last time I touched Domo, in 2018).
I think this is just a SQL question, and I wonder if a simple windowing function would do the trick.
select Date,
Sales,
max (case when *month* = *this month -1* then Sales else null end) over (order by 1) as "Previous Month Sales"
from table
You just need to figure out how to break down the Date into the month based on whatever SQL dialect Domo uses nowadays.
Cheers
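If the data really has exactly one row per month with no gaps, the window-function idea can be sketched with LAG, which simply looks one row back. The snippet below is only an illustration: it runs against an in-memory SQLite database (assuming SQLite 3.25+ for window functions) rather than Domo, uses ISO-formatted dates, and the table and column names are made up for the example.

```python
import sqlite3

def previous_month_sales(rows):
    """Return (date, sales, previous_sales) tuples ordered by date."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (date TEXT, sales INTEGER)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    # LAG looks one row back in date order; the first row has no
    # predecessor, so its previous value comes back as NULL (None).
    result = con.execute(
        """
        SELECT date,
               sales,
               LAG(sales) OVER (ORDER BY date) AS previous_month_sales
        FROM sales
        ORDER BY date
        """
    ).fetchall()
    con.close()
    return result

print(previous_month_sales([("2021-01-01", 10), ("2021-02-01", 20)]))
```

Note that LAG returns the previous row, not the previous calendar month, so it only matches the requested output when no month is missing.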

I think Domo supports a MySQL-like language, so you could do something like this:
with cte as
(
select date,
date + interval 1 month as next_month,
sales
from sales
)
select a.date,
a.sales as current_sales,
b.sales as prior_month_sales
from sales a
left join cte b
on b.next_month = a.date
order by a.date
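To see how the self-join behaves when a month is missing, here is a small sketch of the same idea against an in-memory SQLite database. It is not Domo or MySQL syntax: SQLite's date(x, '+1 month') stands in for date + interval 1 month, dates are assumed to be ISO strings on the 1st of each month, and the table name is just for the example.

```python
import sqlite3

def prior_month_by_join(rows):
    """Return (date, current_sales, prior_month_sales) using a self-join."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (date TEXT, sales INTEGER)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    # Shift each row forward one month and match it to the current row.
    # LEFT JOIN keeps months whose predecessor is absent, with NULL.
    result = con.execute(
        """
        SELECT a.date, a.sales AS current_sales, b.sales AS prior_month_sales
        FROM sales a
        LEFT JOIN sales b ON date(b.date, '+1 month') = a.date
        ORDER BY a.date
        """
    ).fetchall()
    con.close()
    return result

# March is missing, so April gets None instead of February's value.
print(prior_month_by_join([("2021-01-01", 10), ("2021-02-01", 20), ("2021-04-01", 40)]))
```

Unlike a plain "previous row" lookup, the join matches on the calendar month, so a gap in the data shows up as null rather than as stale sales from two months back.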

I do this by joining the table onto itself with a LEFT OUTER JOIN. The outer join allows you to keep the null value for previous month. You match the date such that 1 column is calculated to show the previous month (I do this with EOMONTH() to ensure I always get the previous month and account for the year, if say it is January).
IF OBJECT_ID('TEMPDB..#TEMP') IS NOT NULL
DROP TABLE #TEMP
CREATE TABLE #TEMP(
[Date] DATE
,[Sales] INT
)
INSERT INTO #TEMP([Date],[Sales])
VALUES ('2020-12-20',50)
,('2021-01-20',100)
,('2021-02-20',200)
,('2021-03-20',300)
,('2021-04-20',400)
,('2021-05-20',500)
SELECT #TEMP.[Date]
,#TEMP.Sales
,TEMPII.Date [PREV M]
,TEMPII.Sales [PREV M SALES]
FROM #TEMP
LEFT OUTER JOIN #TEMP TEMPII
ON YEAR(EOMONTH(#TEMP.[Date],-1))*100+MONTH(EOMONTH(#TEMP.[Date],-1)) = YEAR(TEMPII.[Date])*100+MONTH(TEMPII.[Date])
ORDER BY #TEMP.[Date]

Related

SQL reflect data for all days

I am trying to create a table in SQL where I reflect data for all days of a particular month.
For example, even if there is no Sale transaction for a particular day, the day for the particular employee should still be reflective in the table.
Scenario for the question. I am working with the following dates for this example: 1st, 2nd, 3rd and 4th Jan 2022.
The raw data is as follows:
Name | Date       | SaleAmount
John | 2022-01-01 | 154875
John | 2022-01-03 | 598752
As seen above, we only have data for the 1st and 3rd.
The outcome should look as follows:
Name | Date       | SaleAmount
John | 2022-01-01 | 154875
John | 2022-01-02 | NULL
John | 2022-01-03 | 598752
John | 2022-01-04 | NULL
As seen above, the 2nd and 4th should be included even though there was no activity for those days.
What I am currently trying:
I have a master date table which is being used as a RIGHT JOIN on the transaction table. However, the final outcome of my table is as follows:
Name | Date       | SaleAmount
John | 2022-01-01 | 154875
NULL | 2022-01-02 | NULL
John | 2022-01-03 | 598752
NULL | 2022-01-04 | NULL
As seen above, the 'Name' field returns as NULL. The SaleAmount however should reflect NULL to indicate no transactions happening.
I would appreciate any assistance on this.
Seems like you want to:
1) Start with the date table.
2) Cross join to your employee/salesperson table so you now have one row for each salesperson on each date.
3) Left join the sales orders for that date + salesperson combo to get the sum of their sales for that day. If they have none, it'll show null:
select emp.Name
,dat.Date
,sum(ord.Amount) as SaleAmount
from dateList dat
cross join salesPerson emp
left join salesOrder ord on ord.OrderDate = dat.Date and ord.SalesPersonId = emp.SalesPersonId
group by emp.Name
,dat.Date
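As a quick check of the cross join + left join pattern, here is a sketch that reproduces the expected table from the question in an in-memory SQLite database. The table and column names (date_list, sales_person, sales_order) are hypothetical, and the salesperson is keyed by name only to keep the example short.

```python
import sqlite3

def sales_for_all_days(dates, names, orders):
    """Return one (name, date, total) row per salesperson per date."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE date_list (date TEXT);
        CREATE TABLE sales_person (name TEXT);
        CREATE TABLE sales_order (name TEXT, order_date TEXT, amount INTEGER);
        """
    )
    con.executemany("INSERT INTO date_list VALUES (?)", [(d,) for d in dates])
    con.executemany("INSERT INTO sales_person VALUES (?)", [(n,) for n in names])
    con.executemany("INSERT INTO sales_order VALUES (?, ?, ?)", orders)
    # The cross join builds every (person, date) pair first, so the name
    # is never NULL; only the summed amount is NULL on days without sales.
    result = con.execute(
        """
        SELECT emp.name, dat.date, SUM(ord.amount) AS sale_amount
        FROM date_list dat
        CROSS JOIN sales_person emp
        LEFT JOIN sales_order ord
               ON ord.order_date = dat.date AND ord.name = emp.name
        GROUP BY emp.name, dat.date
        ORDER BY emp.name, dat.date
        """
    ).fetchall()
    con.close()
    return result

days = ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04"]
orders = [("John", "2022-01-01", 154875), ("John", "2022-01-03", 598752)]
for row in sales_for_all_days(days, ["John"], orders):
    print(row)
```

The Name column stays filled because it comes from the salesperson table rather than from the transaction table, which is what the RIGHT JOIN attempt was missing.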
You can create a list of dates on the fly, for example per month:
Declare @year int = 2022, @month int = 7;
WITH numbers
as
(
Select 1 as value
Union All
Select value + 1 from numbers
where value + 1 <= Day(EOMONTH(datefromparts(@year,@month,1)))
)
SELECT datefromparts(@year,@month,numbers.value) Datum FROM numbers
then left join to your table.
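For anyone who wants to try the "dates on the fly" idea without a SQL Server instance, a roughly equivalent recursive CTE can be run in SQLite. This is only a sketch: the function name is made up, and SQLite's date() modifiers replace EOMONTH and datefromparts.

```python
import sqlite3

def month_days(year, month):
    """Return every date of the given month as ISO strings."""
    first = f"{year:04d}-{month:02d}-01"
    con = sqlite3.connect(":memory:")
    # Start at the 1st and keep adding one day while the next day
    # still falls before the 1st of the following month.
    rows = con.execute(
        """
        WITH RECURSIVE days(d) AS (
            SELECT date(:first)
            UNION ALL
            SELECT date(d, '+1 day')
            FROM days
            WHERE date(d, '+1 day') < date(:first, '+1 month')
        )
        SELECT d FROM days
        """,
        {"first": first},
    ).fetchall()
    con.close()
    return [r[0] for r in rows]

july = month_days(2022, 7)
print(july[0], july[-1], len(july))
```

The resulting list plays the role of the master date table and can be joined to the transactions in the same way.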
You may use a Recursive CTE as the following:
Declare @startDate as date ='2022-01-01';
Declare @endDate as date ='2022-01-31';
With CTE As
(
Select Distinct Name_ nm,@startDate dt From SalesDates
Where Date_ Between @startDate And @endDate
Union All
Select nm, DateAdd(Day,1,dt) From CTE
Where DateAdd(Day,1,dt)<=@endDate
)
Select C.nm as [Name],C.dt as [Date], S.SaleAmount
From CTE C Left Join SalesDates S
On S.Date_=C.dt
And S.Name_=C.nm
Order By C.nm,C.dt
You can change the values of @startDate and @endDate according to the period you want.
See a demo from db<>fiddle.

How to spread annual amount and then add by month in SQL

Currently I'm working with a table that looks like this:
Month | Transaction | amount
2021-07-01| Annual Membership Fee| 45
2021-08-01| Annual Membership Fee| 145
2021-09-01| Annual Membership Fee| 2940
2021-10-01| Annual Membership Fee| 1545
the amount on that table is the total monthly amount (ex. I have 100 customers who paid $15 for the annual membership, so my total monthly amount would be $1500).
However what I would like to do (and I have no clue how) is divide the amount by 12 and spread it into the future in order to have a monthly revenue per month. As an example for 2021-09-01 I would get the following:
$2490/12 = $207.5 (dollars per month for the next 12 months)
in 2021-09-01 I would only get $207.5 for that specific month.
On 2021-10-01 I would get $1545/12 = $128.75 plus $207.5 from the previous month (total = $336.25 for 2021-10-01)
And the same operation would repeat onwards. The last period that I would collect my $207.5 from 2021-09-01 would be in 2022-08-01.
I was wondering if someone could give me an idea of how to perform this in a SQL query/CTE?
Assuming all the months you care about exist in your table, I would suggest something like:
SELECT
month,
(SELECT SUM(m2.amount/12) FROM mytable m2 WHERE m2.month BETWEEN ADD_MONTHS(m1.month, -11) AND m1.month) as monthlyamount
FROM mytable m1
GROUP BY month
ORDER BY month
For each month that exists in the table, this sums 1/12th of the current amount plus the previous 11 months (using the add_months function). I think that's what you want.
A few notes/thoughts:
I'm assuming (based on the column name) that all the dates in the month column end on the 1st, so we don't need to worry about matching days or having the group by return multiple rows for the same month.
You might want to round the SUMs I did, since in some cases dividing by 12 might give you more digits after the decimal than you want for money (although, in that case, you might also have to consider remainders).
If you really only have one transaction per month (like in your example), you don't need to do the group by.
If the months you care about don't exist in your table, then this won't work, but you could do the same thing by generating a table of months. E.g. if you have an amount on 2020-01-01 but nothing on 2020-02-01, then this won't return a row for 2020-02-01.
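To make the arithmetic concrete, here is a small sketch in plain Python (not SQL) that spreads each payment over 12 months starting in its own month, and generates the future months itself so it does not depend on them existing in the table. The helper names are invented for the example, and the amounts are the four rows from the question's table.

```python
from datetime import date

def add_months(d, n):
    """Shift a first-of-month date by n months."""
    index = d.year * 12 + (d.month - 1) + n
    return date(index // 12, index % 12 + 1, 1)

def monthly_revenue(payments, months=12):
    """Spread each payment evenly over `months` months, starting in its own month."""
    revenue = {}
    for month, amount in payments.items():
        for offset in range(months):
            target = add_months(month, offset)
            revenue[target] = revenue.get(target, 0) + amount / months
    return dict(sorted(revenue.items()))

payments = {
    date(2021, 7, 1): 45,
    date(2021, 8, 1): 145,
    date(2021, 9, 1): 2940,
    date(2021, 10, 1): 1545,
}
for month, amount in monthly_revenue(payments).items():
    print(month, round(amount, 2))
```

With these four rows, October 2021 carries one twelfth of all four payments, and the last share of the October payment lands in September 2022. Rounding is left to the caller, for the reasons mentioned in the notes above.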
CTE = set up dataset
CTE_2 = pro-rate dataset
FINAL SQL = select future_cal_month,sum(pro_rated_amount) from cte_2 group by 1
with cte as (
select '2021-07-01' cal_month,'Annual Membership Fee' transaction ,45 amount
union all select '2021-08-01' cal_month,'Annual Membership Fee' transaction ,145 amount
union all select '2021-09-01' cal_month,'Annual Membership Fee' transaction ,2940 amount
union all select '2021-10-01' cal_month,'Annual Membership Fee' transaction ,1545 amount)
, cte_2 as (
select
dateadd('month', row_number() over (partition by cal_month order by 1) - 1, cal_month) future_cal_month -- offsets 0..11, so the first share lands in the payment month
,amount/12 pro_rated_amount
from
cte
,table(generator(rowcount => 12)) v)
select
future_cal_month
, sum(pro_rated_amount)
from
cte_2
group by
future_cal_month

SQL monthly rolling sum

I am trying to calculate monthly balances of bank accounts from the following postgresql table, containing transactions:
# \d transactions
View "public.transactions"
Column | Type | Collation | Nullable | Default
--------+------------------+-----------+----------+---------
year | double precision | | |
month | double precision | | |
bank | text | | |
amount | numeric | | |
By "rolling sum" I mean that the sum should contain all transactions from the beginning of time until the end of the given month, not just the transactions in the given month.
I came up with the following query:
select
a.year, a.month, a.bank,
(select sum(b.amount) from transactions b
where b.year < a.year
or (b.year = a.year and b.month <= a.month))
from
transactions a
order by
bank, year, month;
The problem is that this returns one row per transaction, so each bank and month appears as many times as it has transactions, and months without transactions do not appear at all.
I would like a query which contains exactly one row for each bank and month for the whole time interval including the first and last transaction.
How to do that?
An example dataset and a query can be found at https://rextester.com/WJP53830 , courtesy of @a_horse_with_no_name
You need to generate a list of months first, then you can outer join your transactions table to that list.
with all_years as (
select y.year, m.month, b.bank
from generate_series(2010, 2019) as y(year) --<< adjust here for your desired range of years
cross join generate_series(1,12) as m(month)
cross join (select distinct bank from transactions) as b(bank)
)
select ay.*, sum(amount) over (partition by ay.bank order by ay.year, ay.month)
from all_years ay
left join transactions t on (ay.year, ay.month, ay.bank) = (t.year::int, t.month::int, t.bank)
order by bank, year, month;
The cross join with all banks is necessary so that the all_years CTE will also contain a bank for each month row.
Online example: https://rextester.com/ZZBVM16426
Here is my suggestion in Oracle 10 SQL:
select a.year,a.month,a.bank, (select sum(b.amount) from
(select c.year as year,c.month as month,c.bank as bank,
sum(c.amount) as amount from transactions c
group by c.year,c.month,c.bank
) b
where b.year<a.year or (b.year=a.year and b.month<=a.month))
from transactions a order by bank, year, month;
Consider aggregating all transactions first by bank and month, then run a window SUM() OVER() for rolling monthly sum since earliest amount.
WITH agg AS (
SELECT t.year, t.month, t.bank, SUM(t.amount) AS Sum_Amount
FROM transactions t
GROUP BY t.year, t.month, t.bank
)
SELECT agg.year, agg.month, agg.bank,
SUM(agg.Sum_Amount) OVER (PARTITION BY agg.bank ORDER BY agg.year, agg.month) AS rolling_sum
FROM agg
ORDER BY agg.year, agg.month, agg.bank
Should you want YTD rolling sums, adjust the OVER() clause by adding year to partition:
SUM(agg.Sum_Amount) OVER (PARTITION BY agg.bank, agg.year ORDER BY agg.month)
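The aggregate-then-window approach can be tried out quickly with an in-memory SQLite database (assuming SQLite 3.25+ for window functions). This is a sketch with made-up sample transactions and integer amounts, not the asker's PostgreSQL view.

```python
import sqlite3

def rolling_monthly_balance(transactions):
    """Return (bank, year, month, rolling_sum), one row per bank and month with activity."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE transactions (year INTEGER, month INTEGER, bank TEXT, amount INTEGER)"
    )
    con.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", transactions)
    # Step 1 collapses the transactions to one row per bank and month.
    # Step 2 accumulates those monthly totals per bank in date order.
    rows = con.execute(
        """
        WITH agg AS (
            SELECT year, month, bank, SUM(amount) AS sum_amount
            FROM transactions
            GROUP BY year, month, bank
        )
        SELECT bank, year, month,
               SUM(sum_amount) OVER (PARTITION BY bank ORDER BY year, month) AS rolling_sum
        FROM agg
        ORDER BY bank, year, month
        """
    ).fetchall()
    con.close()
    return rows

sample = [
    (2019, 1, "A", 100),
    (2019, 1, "A", 50),
    (2019, 2, "A", -30),
    (2019, 1, "B", 10),
    (2019, 3, "B", 5),
]
for row in rolling_monthly_balance(sample):
    print(row)
```

Bank B has no row for February here, because this approach only covers months that have transactions. Filling those gaps still needs a generated list of months to join against.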

T-Sql Cartesian Join to Fill Dates and Join to CTE to Get Most Recent

Our current method of calculating out of stock no longer works for how we track inventory and how we need to view the data. The new method I want to use is to only look at the final OnHandAfter value for each day in the trailing year. We are not 24/7 so the last value entered at the end of each day will tell us if the item was in/out of stock that day. If an item has no inventory transactions for a date it should use the previous found date.
My current query does a cross join of all our items (I currently have it set to a single item for testing) and a calendar table. This gives me 365 days for each item. This is working.
My cte query returns the final OnHandAfter for each date there was a transaction. This is working if run by itself.
With the <= date condition commented out I get 365 rows returned but dates from the cte are NULL. If the condition is not commented out 0 rows are returned.
Note, the next step is to include the OnHandAfter field but for now I can't seem to get the cte to connect.
ABDailyCalendar abdc
This is a table prefilled with every date in the trailing year
Sample Inventory Data (what the cte returns for single item if run by itself, I left out some columns for brevity)
ItemCode TransactionDate OnHandAfter rn
Item-123 10/1/2018 960 1
Item-123 9/28/2018 985 1
Item-123 9/27/2018 1085 1
Item-123 9/26/2018 1485 1
Item-123 9/24/2018 1835 1
Item-123 9/20/2018 2035 1
Item-123 9/18/2018 2185 1
Item-123 9/14/2018 2305 1
Item-123 9/13/2018 2605 1
My Query
with cte as
(
Select TOP 1 * from
(
Select
ItemCode
,convert(Date,TransactionDate) TransactionDate
,TransactionType
,TransactionQuantity
,OnHandBefore
,OnHandAfter
,ROW_NUMBER() over (partition by ItemCode, CONVERT(Date, TransactionDate) order by TransactionDate DESC) as rn
from InventoryTransaction
where TransactionType in (1,2,4,8)
) as ss
where rn = 1
order by TransactionDate DESC
)
SELECT
ab.ExternalId
,abdc.[Date]
,cte.TransactionDate
From ABItems ab CROSS JOIN ABDailyCalendar abdc
FULL OUTER JOIN cte on cte.ItemCode = ab.ExternalId --and cte.TransactionDate <= abdc.[Date]
Where ab.ExternalID = 'Item-123'
order by abdc.[Date] DESC
Current Sample Results
ExternalId Date TransactionDate
Item-123 9/30/2018 NULL
Item-123 9/29/2018 NULL
Item-123 9/28/2018 NULL
Item-123 9/27/2018 NULL
Item-123 9/26/2018 NULL
Item-123 9/25/2018 NULL
Item-123 9/24/2018 NULL
Desired Results
ExternalId Date TransactionDate
Item-123 9/30/2018 9/28/2018
Item-123 9/29/2018 9/28/2018
Item-123 9/28/2018 9/28/2018
Item-123 9/27/2018 9/27/2018
Item-123 9/26/2018 9/26/2018
Item-123 9/25/2018 9/24/2018
Item-123 9/24/2018 9/24/2018
The TransactionDate should be the most recent TransactionDate that is <= to the Date.
If it matters - I am running SSMS 2012 connected to SQL Server 2008.
Any pointers or ideas will be greatly appreciated. I have stared at it so long that nothing new is coming to me. Thanks.
I used Postgres, but it's broadly the same as SQL Server for this operation. Here's an implementation of what I wrote in my comment:
https://www.db-fiddle.com/f/uKcgh9yZVvvqfRWTERv2a3/0
We make some sample data on the left side of the fiddle. This is PG specific but shouldn't matter too much - end result is it gets to the same place you are with your data in SQLS
Then the query:
SELECT
itemcode,
caldate,
case when caldate = transactiondate then onhandafter else prev_onhandafter end as onhandat,
case when caldate = transactiondate then 'tran occurred today, using current onhandafter' else 'no tran today, using previous onhandafter' end as reasoning,
transactiondate,
onhandafter,
prev_onhandafter
FROM
(
SELECT
itemcode,
transactiondate,
LAG(transactiondate) over(partition by itemcode order by transactiondate) as prev_transactiondate,
onhandafter,
LAG(onhandafter) over(partition by itemcode order by transactiondate) as prev_onhandafter
FROM
t
) t2
INNER JOIN
c
ON
c.caldate > t2.prev_transactiondate and c.caldate <= t2.transactiondate
ORDER BY itemcode, caldate
itemcode/externalid (you called it both)
Bunch of dates - whether your dates are DATE or DATETIME they're comparable. No harm in casting a DATETIME to a DATE if you want, and if any of your dates contain a time component it may well be vital to do so, because 2018-01-01 00:00 is not the same as 2018-01-01 01:00, and if your calendar table has midnight, and the transactiondate is 1 am, then the range join condition (caldate > prevtrandate and caldate <= trandate) won't work out properly. Feel free to cast as part of the join: caldate > CAST(prevtrandate as DATE) and caldate <= CAST(trandate as DATE). If your datetimes are 100% guaranteed to be exactly bang on midnight (to the microsecond) then the join will work out without casting - casting here is a quick trick to strip the time off and ensure apples are comparing to apples
OK, so how this works:
Rather than number the rows in the table in a cte and join it to itself, I used a similar technique using LAG to get the previous row's values I'm interested in. Previous here is defined as "per item code, in ascending order of trandate". This gives us rows that have a current trandate, a previous trandate and a previous onhand. (Note: the previous trandate is null for the first row. If that row is to be kept, some extra fiddling with the query, like COALESCE(lag(...), trandate), will be required, otherwise it will disappear when joined.) We'll use the date pair to join and we'll later choose whether to present the current or previous onhand.
This is done as a subquery so that the prev values become available to use. It is joined to the calendar table on the calendar date being greater than the prev trandate and less than or equal to the current trandate. This means that a cartesian product fills in all the gaps in the transaction dates so we get a contiguous set of dates out of the calendar table.
We use a case when to examine the data - if the cal date is equal to the tran date, we use the new value for onhand because a tran occurred today and decremented the stock. Otherwise we can assert that there was no transaction today, and we should use the prev onhand instead.
Hopefully this is all the right way round with regards to what you want (you seemed to indicate that it was onhandafter that you actually wanted, but your desired query output only mentioned the transaction date/caldate pair).
Edit: ok, lag isn't available- here's a solution that uses rownumber:
https://www.db-fiddle.com/f/2ooVrNF18stUQAa4HyTj6r/0
WITH cte AS(
SELECT
itemcode,
transactiondate,
ROW_NUMBER() over(partition by itemcode order by transactiondate) as rown,
onhandafter
FROM
t
)
SELECT
curr.itemcode,
c.caldate,
case when c.caldate = curr.transactiondate then curr.onhandafter else prev.onhandafter end as onhandat,
case when c.caldate = curr.transactiondate then 'tran occurred today, using current onhandafter' else 'no tran today, using previous onhandafter' end as reasoning,
curr.transactiondate,
curr.onhandafter,
prev.onhandafter
FROM
cte curr
INNER JOIN
cte prev
ON curr.rown = prev.rown + 1 and curr.itemcode = prev.itemcode
INNER JOIN
c
ON
c.caldate > prev.transactiondate and c.caldate <= curr.transactiondate
ORDER BY curr.itemcode, c.caldate
How it works: pretty much as my first comment. We partition the table into itemcode and rownumber in order of tran date. We do this as a cte so it's cleaner to say from cte curr inner join cte prev on curr.rownumber = prev.rownumber+1
Lag is thus simulated- we have a row that has current values and previous values. The rest of the query logic remains the same from above
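A different way to get the same "most recent transaction on or before each calendar date" result, without LAG or row numbers, is a correlated MAX lookup. The sketch below is a swapped-in technique rather than the query from the fiddle: it runs on an in-memory SQLite database, handles a single item only, and reuses a few of the sample rows from the question with ISO dates.

```python
import sqlite3

def on_hand_per_day(cal_dates, transactions):
    """Return (caldate, as_of_transactiondate, onhandafter), newest date first."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE c (caldate TEXT)")
    con.execute("CREATE TABLE t (transactiondate TEXT, onhandafter INTEGER)")
    con.executemany("INSERT INTO c VALUES (?)", [(d,) for d in cal_dates])
    con.executemany("INSERT INTO t VALUES (?, ?)", transactions)
    # For each calendar date, find the latest transaction date that is
    # not after it, then join back to pick up that day's closing stock.
    rows = con.execute(
        """
        SELECT m.caldate, m.asof, t.onhandafter
        FROM (
            SELECT c.caldate,
                   (SELECT MAX(t2.transactiondate)
                    FROM t t2
                    WHERE t2.transactiondate <= c.caldate) AS asof
            FROM c
        ) m
        LEFT JOIN t ON t.transactiondate = m.asof
        ORDER BY m.caldate DESC
        """
    ).fetchall()
    con.close()
    return rows

calendar = [f"2018-09-{day}" for day in range(24, 31)]
trans = [
    ("2018-09-24", 1835),
    ("2018-09-26", 1485),
    ("2018-09-27", 1085),
    ("2018-09-28", 985),
    ("2018-10-01", 960),
]
for row in on_hand_per_day(calendar, trans):
    print(row)
```

For several items, the lookup and the join would both need the item code as an extra condition. The input is assumed to already hold one row per day (the day's final OnHandAfter), as the asker's cte is meant to produce.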

Calculating business days in Teradata

I need help in business days calculation.
I've two tables
1) One table ACTUAL_TABLE containing order date and contact date with timestamp datatypes.
2) The second table BUSINESS_DATES has each of the calendar dates listed and has a flag to indicate weekend days.
Using these two tables, I need to ensure that business days, not calendar days (which is the current logic), are calculated between these two fields.
My thought process was to first get a range of dates by comparing ORDER_DATE with TABLE_DATE field and then do a similar comparison of CONTACT_DATE to TABLE_DATE field. This would get me a range from the BUSINESS_DATES table which I can then use to calculate count of days, sum(Holiday_WKND_Flag) fields making the result look like:
Order# | Count(*) As DAYS | SUM(WEEKEND DATES)
100 | 25 | 8
However, this only works when I use a specific order number; I can't bring all order numbers into a subquery.
My Query:
SELECT SUM(Holiday_WKND_Flag), COUNT(*) FROM
(
SELECT
* FROM
BUSINESS_DATES
WHERE BUSINESS.Business BETWEEN (SELECT ORDER_DATE FROM ACTUAL_TABLE
WHERE ORDER# = '100'
)
AND
(SELECT CONTACT_DATE FROM ACTUAL_TABLE
WHERE ORDER# = '100'
)
) TEMP
Uploading the table structure for your reference.
SELECT ORDER#, SUM(Holiday_WKND_Flag), COUNT(*)
FROM business_dates bd
INNER JOIN actual_table at ON bd.table_date BETWEEN at.order_date AND at.contact_date
GROUP BY ORDER#
Instead of joining on a BETWEEN (which always results in a bad Product Join) followed by a COUNT, you'd better assign a business day number to each date (in the best case this is calculated only once and added as a column to your calendar table). Then it's two Equi-Joins and no aggregation is needed:
WITH cte AS
(
SELECT
Cast(table_date AS DATE) AS table_date,
-- assign a consecutive number to each business day, i.e. not increased during weekends, etc.
Sum(CASE WHEN Holiday_WKND_Flag = 1 THEN 0 ELSE 1 end)
Over (ORDER BY table_date
ROWS Unbounded Preceding) AS business_day_nbr
FROM business_dates
)
SELECT ORDER#,
Cast(t.contact_date AS DATE) - Cast(t.order_date AS DATE) AS #_of_days,
b2.business_day_nbr - b1.business_day_nbr AS #_of_business_days
FROM actual_table AS t
JOIN cte AS b1
ON Cast(t.order_date AS DATE) = b1.table_date
JOIN cte AS b2
ON Cast(t.contact_date AS DATE) = b2.table_date
Btw, why are table_date and order_date timestamp instead of a date?
Porting from Oracle?
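The running business-day number is easy to experiment with outside Teradata. Below is a sketch using an in-memory SQLite database (assuming SQLite 3.25+ for window functions). The calendar is generated in Python with only Saturdays and Sundays flagged, which is an assumption for the example, and order_no stands in for ORDER#.

```python
import sqlite3
from datetime import date, timedelta

def business_days_between(orders, start, end):
    """orders: (order_no, order_date, contact_date) as ISO strings.
    The generated calendar covers start..end inclusive."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE business_dates (table_date TEXT, holiday_wknd_flag INTEGER)")
    con.execute("CREATE TABLE actual_table (order_no INTEGER, order_date TEXT, contact_date TEXT)")
    day = start
    while day <= end:
        # Assumption: only weekends are flagged; real holidays would be flagged too.
        flag = 1 if day.weekday() >= 5 else 0
        con.execute("INSERT INTO business_dates VALUES (?, ?)", (day.isoformat(), flag))
        day += timedelta(days=1)
    con.executemany("INSERT INTO actual_table VALUES (?, ?, ?)", orders)
    # The running sum only grows on business days, so subtracting the two
    # numbers counts the business days between the two dates.
    rows = con.execute(
        """
        WITH numbered AS (
            SELECT table_date,
                   SUM(1 - holiday_wknd_flag) OVER (ORDER BY table_date) AS business_day_nbr
            FROM business_dates
        )
        SELECT t.order_no, b2.business_day_nbr - b1.business_day_nbr AS business_days
        FROM actual_table t
        JOIN numbered b1 ON b1.table_date = t.order_date
        JOIN numbered b2 ON b2.table_date = t.contact_date
        ORDER BY t.order_no
        """
    ).fetchall()
    con.close()
    return rows

orders = [(100, "2022-01-03", "2022-01-10"), (101, "2022-01-07", "2022-01-08")]
print(business_days_between(orders, date(2022, 1, 1), date(2022, 1, 31)))
```

Order 100 runs Monday to Monday and comes out as 5 business days. Order 101 runs Friday to Saturday and comes out as 0, because the weekend day does not advance the counter.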
You can use this query. Hope it helps
select order#,
order_date,
contact_date,
(select count(1)
from business_dates_table
where table_date between a.order_date and a.contact_date
and holiday_wknd_flag = 0
) business_days
from actual_table a