Translating SQL with join and window function to DAX - sql

I have a working SQL query, but it needs to be translated to DAX and I'm struggling.
This is what I'm trying to achieve:
I have agreements running from a startdate to an enddate in a fact table. An agreement running for a whole year counts as 1, meaning DATEDIFF(startdate, enddate) / 365.0 gives the "weight" of the agreement. For any given month, I need the sum of the trailing year's total agreement weights. I also have a dimension table related to the fact table with all dates (single days and the year/month each belongs to), which lets me run the following SQL query to get exactly what I want:
SELECT
sub.yearMonth
,SUM(sub.[cnt]) OVER (ORDER BY sub.[yearMonth] DESC ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) / 365.0 AS [runningYear]
FROM
(SELECT
d.yearMonth
,COUNT(*) AS [cnt]
FROM Agreements AS a
INNER JOIN Date AS d
ON d.date >= a.startdate AND d.date <= a.enddate
GROUP BY d.yearMonth
) AS sub
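To make the intended calculation concrete, here is a minimal Python sketch of the same idea: count the active agreement days per month, then sum the month itself plus the 11 months before it and divide by 365. The function names and the in-memory tuples are illustrative assumptions, not part of the model, and the window follows the "trailing year" description in the text rather than any particular SQL ordering.

```python
from collections import Counter
from datetime import date, timedelta


def days_per_month(agreements):
    """Count active agreement days per (year, month).

    agreements: iterable of (startdate, enddate) pairs, both inclusive.
    This mirrors the inner GROUP BY of the date-range join.
    """
    counts = Counter()
    for start, end in agreements:
        day = start
        while day <= end:
            counts[(day.year, day.month)] += 1
            day += timedelta(days=1)
    return counts


def trailing_year_weight(counts, year, month):
    """Sum the given month and the 11 months before it, divided by 365."""
    total = 0
    y, m = year, month
    for _ in range(12):
        total += counts.get((y, m), 0)
        m -= 1
        if m == 0:
            y, m = y - 1, 12
    return total / 365.0


# One agreement covering all of 2015 (a non-leap year) weighs exactly 1
# when viewed from December 2015.
counts = days_per_month([(date(2015, 1, 1), date(2015, 12, 31))])
print(trailing_year_weight(counts, 2015, 12))
```

Viewed from January 2016 the same agreement weighs less, because January 2015 has dropped out of the 12-month window.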
I tried to replicate this in a DAX measure and ended up with the following:
AgreementWeights:=
CALCULATE (
CALCULATE (
COUNTROWS ( 'Agreements' );
FILTER (
'Agreements';
'Agreements'[startdate] <= MAX ( 'Date'[date] )
&& 'Agreements'[enddate] >= EDATE ( MAX( 'Date'[date] ); -12)
)
);
CROSSFILTER ( 'Agreements'[receiveddate]; 'Date'[sys_date_key]; NONE )
)
The last line disables the relationship between receiveddate and the date dimension, which is irrelevant here. This DAX measure yields too many rows, but gives the correct result when divided like (result/365)*100, and I simply cannot figure out why.
An example of the fact table Agreements:
ID startdate enddate receiveddate
0 10-04-2014 12-06-2015 10-03-2014
1 11-06-2014 11-07-2014 11-05-2014
An example of the dimension table Date:
ID date yearMonth sys_date_key
0 10-04-2014 April2014 10042014
1 11-04-2014 April2014 11042014
Thanks :)

Related

SQL - Get historic count of rows collected within a certain period by date

For many years I've been collecting data and I'm interested in knowing the historic counts of IDs that appeared in the last 30 days. The source looks like this:
id dates
1 2002-01-01
2 2002-01-01
3 2002-01-01
... ...
3 2023-01-10
If I wanted to know the historic count of ids that appeared in the last 30 days I would do something like this
with total_counter as (
select id, count(id) counts
from source
group by id
),
unique_obs as (
select id
from source
where dates >= DATEADD(Day ,-30, current_date)
group by id
)
select count(distinct(id))
from unique_obs
left join total_counter
on total_counter.id = unique_obs.id;
The problem is that this returns a single result for today's count, as determined by current_date.
I would like to see a table with such counts as if, for example, I had run this analysis yesterday, the day before, and so on. So the expected result would be something like
counts date
1235 2023-01-10
1234 2023-01-09
1265 2023-01-08
... ...
7383 2022-12-11
So, for example, if current_date were 2023-01-10, my query would have returned 1235.
If you need a distinct count of IDs from the 30 days up to and including each date, the query below should work:
WITH CTE_DATES
AS
(
--Create a list of anchor dates
SELECT DISTINCT
dates
FROM source
)
SELECT COUNT(DISTINCT s.id) AS "counts"
,D.dates AS "date"
FROM CTE_DATES D
LEFT JOIN source S ON S.dates BETWEEN DATEADD(DAY,-29,D.dates) AND D.dates --30 DAYS INCLUSIVE
GROUP BY D.dates
ORDER BY D.dates DESC
;
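As a rough illustration of what the anchor-date join above computes, here is a small Python sketch. It assumes the source rows fit in memory as (id, date) tuples, and the function name is made up for this example. Each distinct date acts as an anchor, and the ids seen in the 30 days up to and including it are counted once each.

```python
from datetime import date, timedelta


def rolling_distinct_counts(rows, window_days=30):
    """Return [(distinct_id_count, anchor_date), ...], newest anchor first.

    rows: iterable of (id, date) tuples.
    The window covers anchor - (window_days - 1) through anchor, inclusive,
    which matches BETWEEN DATEADD(DAY, -29, anchor) AND anchor.
    """
    rows = list(rows)
    anchors = sorted({d for _, d in rows}, reverse=True)
    result = []
    for anchor in anchors:
        window_start = anchor - timedelta(days=window_days - 1)
        ids = {i for i, d in rows if window_start <= d <= anchor}
        result.append((len(ids), anchor))
    return result


sample = [
    (1, date(2023, 1, 10)),
    (2, date(2023, 1, 10)),
    (1, date(2023, 1, 1)),
    (3, date(2022, 12, 1)),
]
for count, anchor in rolling_distinct_counts(sample):
    print(count, anchor)
```

Like the SQL, this rescans the source for every anchor date, so it is meant to show the semantics, not to be fast.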
If the distinct count didn't matter, you could likely simplify with a rolling sum, only hitting the source table once:
SELECT S.dates AS "date"
,COUNT(1) AS "count_daily"
,SUM("count_daily") OVER(ORDER BY S.dates DESC ROWS BETWEEN CURRENT ROW AND 29 FOLLOWING) AS "count_rolling" --assumes there is at least one row for every day.
FROM source S
GROUP BY S.dates
ORDER BY S.dates DESC;
This won't work, though, if you have gaps in your list of dates, as it will just include the latest 30 days available. In that case the first example, without DISTINCT in the count, will do the trick.
SELECT count(*) AS Counts,
dates AS Date
FROM source
WHERE dates >= DATEADD(DAY, -30, CURRENT_DATE)
GROUP BY dates
ORDER BY dates DESC
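To see why a row-based window goes wrong when dates are missing, here is a small Python comparison. The two helper names are invented for this sketch, and the daily counts are assumed to be a dict keyed by date. One helper sums the last 30 available rows, as a ROWS BETWEEN frame does, and the other sums the rows that fall inside the last 30 calendar days.

```python
from datetime import date, timedelta


def rolling_by_rows(daily, n=30):
    """Sum each date's count with the n-1 previous *available* rows."""
    days = sorted(daily)
    return {
        d: sum(daily[x] for x in days[max(0, i - n + 1): i + 1])
        for i, d in enumerate(days)
    }


def rolling_by_dates(daily, n=30):
    """Sum each date's count with rows from the previous n-1 *calendar days*."""
    return {
        d: sum(c for x, c in daily.items() if d - timedelta(days=n - 1) <= x <= d)
        for d in daily
    }


# Two data points two months apart: the row-based window reaches back too far.
daily = {date(2023, 1, 1): 5, date(2023, 3, 1): 7}
print(rolling_by_rows(daily)[date(2023, 3, 1)])   # includes the January row
print(rolling_by_dates(daily)[date(2023, 3, 1)])  # only the March row
```

With one row per day and no gaps, both helpers return the same totals, which is the assumption noted in the comment on the rolling-sum query.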

Looking to Aggregate 3 Counts of Data as columns by distinct account number in a given list. All of my data is in the same table

Here is the basic idea of what I am trying to do, in pseudocode. All of the data I need is in the same table.
SELECT DISTINCT ACCOUNT_NUMBER
, COUNT(INVOICES FOR INV_DATE WITHIN 2022)
, COUNT(INVOICES FOR INV_DATE WITHIN 2021)
, COUNT(INVOICES FOR INV_DATE WITHIN 2020)
FROM SALES_DATA
WHERE ACCOUNT_NUMBER IN ('987987','98845','966554')
I can easily get the first column, but I am struggling to join in the additional years.
You can use PIVOT after collapsing the invoices to just the account number and year:
; -- previous statement terminator sqlblog.org/cte
WITH cte AS
(
SELECT ACCOUNT_NUMBER, y = DATEPART(YEAR, INV_DATE)
FROM dbo.SALES_DATA
WHERE INV_DATE >= '20200101'
-- AND ACCOUNT_NUMBER IN (some,list)
)
SELECT * FROM cte
PIVOT
(
COUNT(y) FOR y IN ([2020],[2021],[2022])
) AS p;
Example db<>fiddle
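For readers who want to check the expected shape of the output, here is a minimal Python sketch of the same per-account, per-year counting. The function name and the tuple layout are assumptions for illustration only. Accounts from the list that have no invoices still get a row of zeros, which a plain GROUP BY would not give you.

```python
from datetime import date


def invoices_per_year(rows, accounts, years=(2020, 2021, 2022)):
    """Count invoices per account and year.

    rows: iterable of (account_number, inv_date) tuples.
    Returns {account: {year: count}} with one entry per requested account.
    """
    table = {acct: {y: 0 for y in years} for acct in accounts}
    for acct, inv_date in rows:
        # Ignore accounts outside the list and years outside the columns.
        if acct in table and inv_date.year in table[acct]:
            table[acct][inv_date.year] += 1
    return table


rows = [
    ("987987", date(2020, 5, 1)),
    ("987987", date(2022, 1, 3)),
    ("987987", date(2022, 7, 9)),
    ("98845", date(2021, 2, 2)),
]
for acct, per_year in invoices_per_year(rows, ["987987", "98845", "966554"]).items():
    print(acct, per_year)
```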

Rolling 12 month filter criteria in SQL

I'm having an issue in a SQL script where I'm trying to filter on a rolling 12 months using the day column, which is stored as text on the server.
The goal is to count sizes per product at each retail store location over the last 12 months from the current day. Currently my query filters on the year 2019, which only counts the sizes for that year rather than for a rolling 12 months from the current date.
The CALENDARDAY column is a text field in the data set, and the data is stored in yyyymmdd format.
When I try to run the script below in Tableau with the GETDATE and DATEADD functions, it gives me a function error. I am querying a SAP HANA server with the query below.
Any help would be appreciated.
Select
SKU, STYLE_ID, Base_Style_ID, COLOR, SIZEKEY, STORE, Year,
count(SIZEKEY)over(partition by STYLE_ID,COLOR,STORE,Year) as SZ_CNT
from
(
select
a."RAW" As SKU,
a."STYLENUM" As STYLE_ID,
mat."BASENUM" AS Base_Style_ID,
a."COLORNUM" AS COLOR,
a."SIZE" AS SIZEKEY,
a."STORENUM" AS STORE,
substring(a."CALENDARDAY",1,4) As year
from PRTRPT_XRE as a
JOIN ZAT_SKU As mat On a."RAW" = mat."SKU"
where a."ORGANIZATION" = 'M20'
and a."COLORNUM" is not null
and substring(a."CALENDARDAY",1,4) = '2019'
Group BY
a."RAW",
a."STYLENUM",
mat."BASENUM",
a."ZCOLORCD",
a."SIZE",
a."STORENUM",
substring(a."CALENDARDAY",1,4)
)
I have never worked with that database/server, so I have no way to test this.
But hopefully this will work (it keeps rows from exactly the 12 months before today's date). Since CALENDARDAY is stored as yyyymmdd text, the format mask has to match:
AND ADD_MONTHS (TO_DATE (a."CALENDARDAY", 'YYYYMMDD'), 12) > CURRENT_DATE
or
AND ADD_MONTHS (a."CALENDARDAY", 12) > CURRENT_DATE
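The idea behind that condition can be checked outside the database with a short Python sketch. Both helper names are hypothetical, and the month-end clamping in add_months is an assumption of this sketch that may differ in detail from HANA's ADD_MONTHS. The text value is parsed as yyyymmdd, shifted forward 12 months, and kept only if the shifted date is still after today.

```python
import calendar
from datetime import date, datetime


def add_months(d, months):
    """Shift a date by whole months, clamping to the last day of the month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def in_last_12_months(calendarday, today):
    """True if a yyyymmdd text date falls within the 12 months before today."""
    day = datetime.strptime(calendarday, "%Y%m%d").date()
    return add_months(day, 12) > today


today = date(2020, 3, 14)
print(in_last_12_months("20190315", today))  # one day inside the window
print(in_last_12_months("20190314", today))  # exactly 12 months ago, excluded
```

Comparing real dates also avoids relying on implicit text-to-date conversion, which is what the second variant of the condition depends on.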
The condition below, built on one of our calendar tables, also worked the same way as the ADD_MONTHS approach in the answer above:
select distinct CALENDARDAY
from
(
select FISCALWEEK, CALENDARDAY, CNST, row_number()over(partition by CNST order by FISCALWEEK desc) as rnum
from
(
select distinct FISCALWEEK, CALENDARDAY, 'A' as CNST
from CALENDARTABLE
where CALENDARDAY < current_date
order by 1,2
)
) where rnum < 366

sum based on max production date and min production date MTD,WTD, YTD SQL Server

Hello, I am trying to create an automated query that displays month to date, year to date, and week to date, with a column for each. For year to date, I need the sum of the balance amounts on the maximum production date minus the sum on the minimum production date. I also need month to date and week to date, if anyone has any ideas. Any help with this would be appreciated. Thanks!
P.S. I am using Microsoft SQL Server Management Studio.
Here is what I have so far:
select SUM([curr_bal_amt]) as total_amt , [prod_dt] as date123
from [dbo].[DEPOSIT_TEST]
group by [prod_dt];
This results in a chart like:
Overall, I need to calculate year to date as the sum on the max date I have minus the sum on the min date I have. Later on, when I import more data, I need to do MTD and WTD. Thanks
Edit: I am looking to use my current table, so maybe it would help to edit this table, as I forgot to mention that I have 3-day gaps in the data.
Also, my prod_dt column has multiple balances per date, which I must sum when the prod_dt is the same. Is there a simple query to subtract the first date of the last month's sum of curr_bal_amt from the most recent date's sum of curr_bal_amt? Thanks for your help Shawn, it is greatly appreciated!
This is an example of one of my data imports for one of my days:
Please use the names of my columns if you can; it would help me learn. Thank you! The name of my table is Deposit_Test and the column names are just like the ones in the picture. Thank you again
This should give you a good idea of how to get at those totals. I don't know what other data you're after in your tables, but you should be able to modify the below query to get at it.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
/********************************CALENDAR********************************/
/*
My original answer made use of a Calendar Table, but I realized it
was overkill for this situation. I still think every database should
have both a Calendar Table and a Numbers Table. They are both very
useful. I use the ct here just to populate my test table, but I've
left some very basic creation to show you how it can be done. Calcs
done here allow your final query to JOIN to it and avoid RBAR to be
more set-based, and save a lot of processing for large tables.
NOTE: This original date table concept is from Aaron Bertrand.
*/
CREATE TABLE datedim (
theDate date PRIMARY KEY
, theDay AS DATEPART(day, theDate) --int
, theWeek AS DATEPART(week, theDate) --int
, theMonth AS DATEPART(month, theDate) --int
, theYear AS DATEPART(year, theDate) --int
, yyyymmdd AS CONVERT(char(8), theDate, 112) /* yyyymmdd */
);
/************************************************************************/
/*
Use the catalog views to generate as many rows as we need. This example
creates a date dimension for all of 2018.
*/
INSERT INTO datedim ( theDate )
SELECT d
FROM (
SELECT d = DATEADD(day, rn - 1, '20180101')
FROM
(
SELECT TOP (DATEDIFF(day, '20180101', '20190101'))
rn = ROW_NUMBER() OVER (ORDER BY s1.object_id)
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
ORDER BY s1.object_id
) AS x
) AS y;
/************************************************************************/
/***** TEST TABLE SETUP *****/
CREATE TABLE t1 ( id int identity, entryDate date, cnt int) ;
INSERT INTO t1 (entryDate, cnt)
SELECT theDate, 2
FROM datedim
;
/* Remove a few "random" records to test our counts. */
DELETE FROM t1
WHERE datePart(day,entryDate) IN (10,6,14,22) OR datepart(month,entryDate) = 6
;
Main Query:
/* Make sure the first day of our week is consistent. */
SET DATEFIRST 7 ; /* SUNDAY */
/* Then build out our query needs with CTEs. */
; WITH theDate AS (
SELECT d.dt FROM ( VALUES ( '2018-05-17' ) ) d(dt)
)
, base AS (
SELECT t1.entryDate
, t1.cnt
, theDate.dt
, datepart(year,theDate.dt) AS theYear
, datepart(month,theDate.dt) AS theMonth
, datepart(week,theDate.dt) AS theWeek
FROM t1
CROSS APPLY theDate
WHERE t1.EntryDate <= theDate.dt
AND datePart(year,t1.EntryDate) = datePart(year,theDate.dt)
)
/* Year-to-date totals */
, ytd AS (
SELECT b.theYear, sum(cnt) AS s
FROM base b
GROUP BY b.theYear
)
/* Month-to-date totals */
, mtd AS (
SELECT b2.theYear, b2.theMonth, sum(cnt) AS s
FROM base b2
WHERE b2.theMonth = datePart(month,b2.EntryDate)
GROUP BY b2.theYear, b2.theMonth
)
/* Week-to-date totals */
, wtd AS (
SELECT b3.theYear, b3.theMonth, sum(cnt) AS s
FROM base b3
WHERE b3.theWeek = datePart(week,b3.EntryDate)
GROUP BY b3.theYear, b3.theMonth
)
SELECT blah = 'CountRow'
, ytd.s AS ytdAmt
, mtd.s AS mtdAmt
, wtd.s AS wtdAmt
FROM ytd
CROSS APPLY mtd
CROSS APPLY wtd
Results:
| blah | ytdAmt | mtdAmt | wtdAmt |
|----------|--------|--------|--------|
| CountRow | 236 | 28 | 8 |
Again, the data that you need to get will likely change the overall query, but this should point in the right direction. You can use each CTE to verify the YTD, MTD and WTD totals.
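If it helps to sanity-check the numbers, the same YTD/MTD/WTD logic can be sketched in a few lines of Python. The function name is made up for this example, and the week is assumed to start on Sunday and to be cut off at January 1, which is how I read DATEFIRST 7 combined with the same-year filter. The sample data is rebuilt the same way as the test table above.

```python
from datetime import date, timedelta


def to_date_totals(rows, as_of):
    """Return (ytd, mtd, wtd) sums of cnt for rows up to and including as_of.

    rows: iterable of (entry_date, cnt) tuples.
    """
    # Python: Monday=0 ... Sunday=6, so this is the number of days since Sunday.
    days_since_sunday = (as_of.weekday() + 1) % 7
    week_start = max(as_of - timedelta(days=days_since_sunday),
                     date(as_of.year, 1, 1))
    ytd = mtd = wtd = 0
    for entry_date, cnt in rows:
        if entry_date.year != as_of.year or entry_date > as_of:
            continue
        ytd += cnt
        if entry_date.month == as_of.month:
            mtd += cnt
        if entry_date >= week_start:
            wtd += cnt
    return ytd, mtd, wtd


# Rebuild the test data: every day of 2018 with cnt = 2, minus the deleted rows.
rows = []
day = date(2018, 1, 1)
while day.year == 2018:
    if day.day not in (6, 10, 14, 22) and day.month != 6:
        rows.append((day, 2))
    day += timedelta(days=1)

print(to_date_totals(rows, date(2018, 5, 17)))  # (236, 28, 8)
```

The three totals match the ytdAmt, mtdAmt and wtdAmt columns in the results table.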

Calculating business days in Teradata

I need help with a business days calculation.
I have two tables:
1) One table, ACTUAL_TABLE, containing order date and contact date with timestamp data types.
2) The second table, BUSINESS_DATES, lists each calendar date and has a flag to indicate weekend days.
Using these two tables, I need to make sure that business days, not calendar days (the current logic), are calculated between these two fields.
My thought process was to first get a range of dates by comparing ORDER_DATE with the TABLE_DATE field and then do a similar comparison of CONTACT_DATE to the TABLE_DATE field. This would give me a range from the BUSINESS_DATES table, which I can then use to calculate the count of days and sum(Holiday_WKND_Flag), making the result look like:
Order# | Count(*) As DAYS | SUM(WEEKEND DATES)
100 | 25 | 8
However, this only works when I use a specific order number; I can't bring all order numbers into a subquery.
My Query:
SELECT SUM(Holiday_WKND_Flag), COUNT(*) FROM
(
SELECT
* FROM
BUSINESS_DATES
WHERE TABLE_DATE BETWEEN (SELECT ORDER_DATE FROM ACTUAL_TABLE
WHERE ORDER# = '100'
)
AND
(SELECT CONTACT_DATE FROM ACTUAL_TABLE
WHERE ORDER# = '100'
)
) TEMP
Uploading the table structure for your reference.
SELECT ORDER#, SUM(Holiday_WKND_Flag), COUNT(*)
FROM business_dates bd
INNER JOIN actual_table at ON bd.table_date BETWEEN at.order_date AND at.contact_date
GROUP BY ORDER#
Instead of joining on a BETWEEN (which always results in a bad product join) followed by a COUNT, you'd better assign a business day number to each date (in the best case this is calculated only once and added as a column to your calendar table). Then it's two equi-joins and no aggregation is needed:
WITH cte AS
(
SELECT
Cast(table_date AS DATE) AS table_date,
-- assign a consecutive number to each business day, i.e. not increased during weekends, etc.
Sum(CASE WHEN Holiday_WKND_Flag = 1 THEN 0 ELSE 1 end)
Over (ORDER BY table_date
ROWS Unbounded Preceding) AS business_day_nbr
FROM business_dates
)
SELECT ORDER#,
Cast(t.contact_date AS DATE) - Cast(t.order_date AS DATE) AS #_of_days,
b2.business_day_nbr - b1.business_day_nbr AS #_of_business_days
FROM actual_table AS t
JOIN cte AS b1
ON Cast(t.order_date AS DATE) = b1.table_date
JOIN cte AS b2
ON Cast(t.contact_date AS DATE) = b2.table_date
Btw, why are table_date and order_date timestamp instead of a date?
Porting from Oracle?
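The running business-day number is easy to try out in plain Python. In this sketch the calendar is assumed to be a date-sorted list of (date, is_weekend_or_holiday) pairs, and both function names are invented for illustration. Each date gets the cumulative count of business days so far, and the difference between two dates is then a simple subtraction of two lookups.

```python
from datetime import date, timedelta
from itertools import accumulate


def business_day_numbers(calendar_rows):
    """Map each date to a running business-day number.

    calendar_rows: list of (date, is_weekend_or_holiday), sorted by date.
    The number does not increase on flagged days.
    """
    running = accumulate(0 if flagged else 1 for _, flagged in calendar_rows)
    return {d: n for (d, _), n in zip(calendar_rows, running)}


def business_days_between(numbers, order_date, contact_date):
    """Difference of the two running numbers: two lookups, no range scan."""
    return numbers[contact_date] - numbers[order_date]


# Two weeks starting Monday 2024-01-01, with Saturdays and Sundays flagged.
start = date(2024, 1, 1)
calendar_rows = [
    (start + timedelta(days=k), (start + timedelta(days=k)).weekday() >= 5)
    for k in range(14)
]
numbers = business_day_numbers(calendar_rows)
print(business_days_between(numbers, date(2024, 1, 1), date(2024, 1, 8)))  # 5
```

Note that, as in the SQL, the order date itself is not counted: Friday to the following Monday comes out as 1.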
You can use this query. Hope it helps
select order#,
order_date,
contact_date,
(select count(1)
from business_dates
where table_date between a.order_date and a.contact_date
and holiday_wknd_flag = 0
) business_days
from actual_table a