Max partition by DAX measure equivalent? - sql

In DAX/Power BI, I am wondering whether it is possible to create an aggregate calculation on a subset of data within a dataset.
I have a listing of customer scores for a period of time, e.g.
date, customer, score
-----------------------
1.1.17, A, 12
2.1.17, A, 16
4.1.17, B, 10
5.1.17, B, 14
I would like to identify the max date per customer, e.g.
date, customer, score, max date per client
-------------------------------------------
1.1.17, A, 12, 2.1.17
2.1.17, A, 16, 2.1.17
4.1.17, B, 10, 5.1.17
5.1.17, B, 14, 5.1.17
The SQL equivalent would be something like:
MAX(date) OVER (PARTITION BY customer).
In DAX/Power BI I realise that a calculated column can be used in combination with EARLIER, but this is not suitable because a calculated column does not respond to filtering from a slicer. I.e., I would like to find the max date per client, as illustrated above, for a filtered date range controlled by a slicer, not for the full data set, which is what a calculated column gives. Is such a measure possible?

You will want a measure like this:
Max Date by Customer =
CALCULATE(
    MAX(Table1[Date]),
    FILTER(
        ALLSELECTED(Table1),
        Table1[customer] = MAX(Table1[customer])
    )
)
The ALLSELECTED removes the local filter context while preserving any slicer filtering.
The filter Table1[customer] = MAX(Table1[customer]) is basically the measure equivalent of Table1[customer] = EARLIER(Table1[customer]) in a calculated column.

You can use a correlated subquery:
select t.*,
       (select max(t1.date)
        from table t1
        where t1.customer = t.customer
       ) as max_date_per_client
from table t;
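For comparison, the window-function form from the question and the correlated subquery can be checked side by side. This is only a minimal sketch: it assumes Python's built-in sqlite3 module (the question does not name a database, and window functions need SQLite 3.25 or later), a hypothetical table name `scores`, and ISO-formatted dates. The `WHERE` clause stands in for the slicer's date range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; dates stored as ISO text so they sort correctly.
conn.execute("CREATE TABLE scores (date TEXT, customer TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("2017-01-01", "A", 12),
        ("2017-01-02", "A", 16),
        ("2017-01-04", "B", 10),
        ("2017-01-05", "B", 14),
    ],
)

# Window function: evaluated after WHERE, so it only sees the filtered rows.
window_rows = conn.execute(
    """
    SELECT date, customer, score,
           MAX(date) OVER (PARTITION BY customer) AS max_date_per_client
    FROM scores
    WHERE date <= '2017-01-04'
    ORDER BY date
    """
).fetchall()

# Correlated subquery: the date filter has to be repeated inside the
# subquery, otherwise it would look at the whole table.
subquery_rows = conn.execute(
    """
    SELECT t.date, t.customer, t.score,
           (SELECT MAX(t1.date)
            FROM scores t1
            WHERE t1.customer = t.customer
              AND t1.date <= '2017-01-04') AS max_date_per_client
    FROM scores t
    WHERE t.date <= '2017-01-04'
    ORDER BY t.date
    """
).fetchall()

for row in window_rows:
    print(row)
```

With the filter excluding 2017-01-05, customer B's maximum becomes 2017-01-04 in both forms, which is the slicer-aware behaviour the question asks for.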

Related

Selecting another column from a query with a different date filter

I am working with some sales data and pulling the metrics for a particular week I am defining in the filter.
However, I want to add another column (first_sale_date) to my query. This will show the first time this asin/mp combo shows up in my table regardless of the date filter I am trying to pull the other metrics for.
Because I am already filtering by date, I don't know how to look back at all of the data in the table to find its first appearance, since that is before the week I am filtering for.
select date
,asin
,marketplace
,SUM(ordered_product_sales) as OPS
,SUM(cogs) as cogs
**,min(date) as first_sale_date**
from prod.sales
where date > '2023-01-01'
group by 1,2,3,4
You can use a correlated subquery for this:
select year
,week
,asin
,marketplace
,SUM(ordered_product_sales) as OPS
,SUM(cogs) as cogs
,( SELECT MIN(sal1.date) FROM prod.sales sal1 WHERE sal1.asin = sal.asin and sal1.marketplace = sal.marketplace) as first_sale_date
from prod.sales sal
where year = '2023' and week = '1'
group by 1,2,3,4
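The key point is that the correlated subquery has its own scope, so the outer date filter does not restrict it. A small sketch of that behaviour, assuming Python's built-in sqlite3 module, a simplified hypothetical `sales` table without the schema prefix, and invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (date TEXT, asin TEXT, marketplace TEXT, "
    "ordered_product_sales REAL, cogs REAL)"
)
# Invented sample data: B001 first sold in 2022, before the filtered period.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        ("2022-11-15", "B001", "US", 10.0, 4.0),
        ("2023-01-03", "B001", "US", 20.0, 8.0),
        ("2023-01-04", "B001", "US", 30.0, 12.0),
        ("2023-01-05", "B002", "US", 5.0, 2.0),
    ],
)

rows = conn.execute(
    """
    SELECT sal.asin,
           sal.marketplace,
           SUM(sal.ordered_product_sales) AS ops,
           SUM(sal.cogs) AS cogs,
           -- Not restricted by the outer WHERE: scans the whole table.
           (SELECT MIN(s1.date)
            FROM sales s1
            WHERE s1.asin = sal.asin
              AND s1.marketplace = sal.marketplace) AS first_sale_date
    FROM sales sal
    WHERE sal.date > '2023-01-01'
    GROUP BY sal.asin, sal.marketplace
    ORDER BY sal.asin
    """
).fetchall()

for row in rows:
    print(row)
```

The sums for B001 cover only the 2023 rows, while its `first_sale_date` still comes back as the 2022 date.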

Calculate stdev over a variable range in SQL Server

Table format is as follows:
Date ID subID value
-----------------------------
7/1/1996 100 1 .0543
7/1/1996 100 2 .0023
7/1/1996 200 1 -.0410
8/1/1996 100 1 -.0230
8/1/1996 200 1 .0121
I'd like to apply STDEV to the value column where date falls within a specified range, grouping on the ID column.
Desired output would look something like this:
DateRange, ID, std_v
1 100 .0232
2 100 .0323
1 200 .0423
One idea I've had that works but is clunky, involves creating an additional column (which I've called 'partition') to identify a 'group' of values over which STDEV is taken (by using the OVER function and PARTITION BY applied to 'partition' and 'ID' variables).
Creating the partition variable involves a prior CASE statement in which a given record is assigned a partition based on its date falling within a given range, i.e.:
...
, partition = CASE
WHEN date BETWEEN '7/1/1996' AND '10/1/1996' THEN 1
WHEN date BETWEEN '10/1/1996' AND '1/1/1997' THEN 2
...
Ideally, I'd be able to apply STDEV and the OVER function partitioning on the variable ID and variable date ranges (eg, say, trailing 3 months for a given reference date). Once this works for the 3 month period described above, I'd like to be able to make the date range variable, creating an additional '#dateRange' variable at the start of the program to be able to run this for 2, 3, 6, etc month ranges.
I ended up coming upon a solution to my question.
You can join the original table to a second table, consisting of a unique list of the dates in the first table, applying a BETWEEN clause to specify desired range.
Sample query below.
Initial table (#excessRet), with columns:
Date, ID, subID, value
Second table (#dates), a unique list of the dates in the first table, with columns:
Date
select d.date, er.id, STDEV(er.value)
from #dates d
inner join #excessRet er
on er.date between DATEADD(m, -36, d.date) and d.date
group by d.date, er.id
order by er.id, d.date
To achieve the desired next step referenced above (making range variable), simply create a variable at the outset and replace "36" with the variable.
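The same join-on-a-date-range idea can be tried outside SQL Server. The sketch below is an approximation under stated assumptions: it uses Python's built-in sqlite3 module, which has no STDEV, so a small user-defined aggregate (`SampleStdev`, a hypothetical helper) supplies a sample standard deviation. The table names `excess_ret` and `dates` and the variable `months_back` are illustrative, and SQLite's `date(x, '-N months')` stands in for `DATEADD`.

```python
import sqlite3
import statistics


class SampleStdev:
    """Sample standard deviation; returns None for fewer than two values."""

    def __init__(self):
        self.values = []

    def step(self, value):
        if value is not None:
            self.values.append(value)

    def finalize(self):
        if len(self.values) < 2:
            return None
        return statistics.stdev(self.values)


conn = sqlite3.connect(":memory:")
conn.create_aggregate("stdev", 1, SampleStdev)
conn.execute(
    "CREATE TABLE excess_ret (date TEXT, id INTEGER, sub_id INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO excess_ret VALUES (?, ?, ?, ?)",
    [
        ("1996-07-01", 100, 1, 0.0543),
        ("1996-07-01", 100, 2, 0.0023),
        ("1996-07-01", 200, 1, -0.0410),
        ("1996-08-01", 100, 1, -0.0230),
        ("1996-08-01", 200, 1, 0.0121),
    ],
)
# Unique list of reference dates, as in the answer's second table.
conn.execute("CREATE TABLE dates AS SELECT DISTINCT date FROM excess_ret")

months_back = 3  # the variable range; replaces the hard-coded number of months

rows = conn.execute(
    """
    SELECT d.date, er.id, stdev(er.value) AS std_v
    FROM dates d
    JOIN excess_ret er
      ON er.date BETWEEN date(d.date, ?) AND d.date
    GROUP BY d.date, er.id
    ORDER BY er.id, d.date
    """,
    (f"-{months_back} months",),
).fetchall()

for row in rows:
    print(row)
```

Each reference date gets one row per ID, computed over the trailing window. A group with a single value yields NULL, which matches how a sample standard deviation is undefined for one observation.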

SQL query getting multiple where-claused aliases

Hoping you can help with this issue.
I have energy management software running on a system. The data logged is the total value, stored in the column Value, and it is logged every hour. Some other data is logged alongside it, including a boolean called Active and an integer called Day.
What I'm going for is one query that gets me a sorted list of days, the total power usage of each day, and the peak power usage of each day.
The peak power usage is calculated using Max/Min of Value where Active is set. On some days, however, the Active bit isn't set, and this query alone would yield NULL.
This is my query:
SELECT
A.Day, A.Forbrug, B.Peak
FROM
(SELECT
Day, Max(Value) - Min(Value) AS Forbrug
FROM
EL_HT1_K
WHERE
MONTH = 8 AND YEAR = 2016
GROUP By Day) A,
(SELECT
Day, Max(Value) - Min(Value) AS Peak
FROM
EL_HT1_K
WHERE
Month = 8 AND Year = 2016 AND Active = 1
GROUP BY Day) B
WHERE
A.Day = B.Day
This only returns the days for which query B (peak usage) yields results.
What I want is for the rest of the results from inner query A to still be shown, even when query B yields 0/NULL for that day.
Is this possible, and how?
FYI. The reason I need this to be in one query, is that the scada system has some difficulties handling multiple queries.
I think you just want conditional aggregation. Based on your description, this seems to be the query you want:
SELECT Day, SUM(Value) as total,
MAX(CASE WHEN Active = 1 THEN Value END) as Peak
FROM EL_HT1_K
WHERE Month = 8 AND Year = 2016
GROUP BY Day;
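A quick way to see that conditional aggregation keeps days without any Active rows is to run it on a few invented readings. This sketch assumes Python's built-in sqlite3 module and, as a deliberate variation on the query above, keeps the question's own Max(Value) - Min(Value) definition for both columns. Whether that is the right definition of total and peak is an assumption, not something the thread settles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EL_HT1_K (Year INTEGER, Month INTEGER, Day INTEGER, "
    "Value INTEGER, Active INTEGER)"
)
# Invented meter readings: day 1 has Active rows, day 2 has none.
conn.executemany(
    "INSERT INTO EL_HT1_K VALUES (2016, 8, ?, ?, ?)",
    [
        (1, 100, 0),
        (1, 110, 1),
        (1, 130, 1),
        (1, 135, 0),
        (2, 135, 0),
        (2, 140, 0),
        (2, 150, 0),
    ],
)

rows = conn.execute(
    """
    SELECT Day,
           MAX(Value) - MIN(Value) AS Forbrug,
           -- CASE without ELSE yields NULL for inactive rows,
           -- and MAX/MIN ignore NULLs.
           MAX(CASE WHEN Active = 1 THEN Value END)
             - MIN(CASE WHEN Active = 1 THEN Value END) AS Peak
    FROM EL_HT1_K
    WHERE Month = 8 AND Year = 2016
    GROUP BY Day
    ORDER BY Day
    """
).fetchall()

for row in rows:
    print(row)
```

Day 2 still appears, with a NULL peak. Wrapping the Peak expression in `COALESCE(..., 0)` would turn that into 0 if the SCADA system prefers a number.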

How to join to inner query and calculate column based on different groupings?

I have a table that contains data about a series of visits to shops.
The raw data for these visits can be found here.
My main table will have 1 row per Country, and will use something along the lines of:
Select Distinct o.Country from OtherTable as o
I need to add a new column to my main table that uses the following calculation:
"Avg Visits by User" = (Sum of (No. Call IDs / No. unique User IDs) for each day) / No. of unique days (based on Actual Start) for the row.
I have formed this additional select statement to get the number of calls and users by day - but I am struggling to join this to my main table:
Select DATEPART(DAY, c.ActualStart) As 'Day',
CAST(CAST(COUNT(c.CallID) AS DECIMAL (5,1))/CAST(COUNT(Distinct c.UserID) AS DECIMAL (5,1)) AS DECIMAL (5,1)) as 'Value' from CallInfo as c
where (c.Status = 3)
Group by DATEPART(DAY, c.ActualStart)
For the country GB, I would expect to see the following output:
Day Calls Users Calls / Users
13-Jun 29 8 3.625
14-Jun 31 7 4.428571429
So, in my main table, the calculation for my new column would be:
8.053571 / 2
Therefore, if I somehow add this to my table I would expect the following output:
Country  Unique Days  Sum of (Calls / Users) for each day  Final Calc
GB       2            8.053571429                          4.026785714
I have tried adding this as a join, but I don't know how to join this to my main table. I could for example join on Call Id - but this would require the addition of a callID column in my inner query, and this would mean that the values are incorrect.
You can use a subquery to make the calculations by day and then make the calculations by country on top of it. The resulting SQL query can look like this:
-- Make calculation by country, from the subquery
SELECT Country, UniqueDays = count(TheDay), CallsUserPerDay = sum(CallsPerUser),
FinalCalc = sum(CallsPerUser) / cast(count(TheDay) as DECIMAL)
FROM (
-- SUBQUERY: Make calculations by day
SELECT c.Country, c.ActualStart as TheDay,
Calls = COUNT(c.CallID),
Users = COUNT(Distinct c.UserID),
COUNT(c.CallID)
/CAST(COUNT(Distinct c.UserID) AS DECIMAL) as CallsPerUser
FROM CallInfo as c
WHERE (c.Status = 3)
GROUP BY c.Country, c.ActualStart
) data
GROUP BY Country
Note: I omit the precision in the DECIMAL casts to avoid rounding in the final result.
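The two-level aggregation can be checked against the GB figures from the question (29 calls by 8 users, then 31 calls by 7 users). This is a sketch under assumptions: it uses Python's built-in sqlite3 module, a hypothetical helper `make_calls` that generates the rows, an assumed year of 2016, and `date(ActualStart)` to strip the time part in case ActualStart is a datetime.

```python
import sqlite3


def make_calls(day, n_calls, n_users, first_id):
    """Build n_calls completed calls (Status = 3) spread round-robin over n_users."""
    return [
        (first_id + i, f"user{i % n_users}", "GB", f"{day} {9 + i % 8:02d}:00:00", 3)
        for i in range(n_calls)
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CallInfo (CallID INTEGER, UserID TEXT, Country TEXT, "
    "ActualStart TEXT, Status INTEGER)"
)
calls = make_calls("2016-06-13", 29, 8, 1) + make_calls("2016-06-14", 31, 7, 100)
# One call with a different status, which the WHERE clause should exclude.
calls.append((9999, "userX", "GB", "2016-06-15 10:00:00", 1))
conn.executemany("INSERT INTO CallInfo VALUES (?, ?, ?, ?, ?)", calls)

result = conn.execute(
    """
    SELECT Country,
           COUNT(*) AS UniqueDays,
           SUM(CallsPerUser) AS SumCallsPerUser,
           SUM(CallsPerUser) / COUNT(*) AS FinalCalc
    FROM (
        -- Inner level: one row per country and calendar day.
        SELECT Country,
               date(ActualStart) AS TheDay,
               COUNT(CallID) * 1.0 / COUNT(DISTINCT UserID) AS CallsPerUser
        FROM CallInfo
        WHERE Status = 3
        GROUP BY Country, date(ActualStart)
    ) AS per_day
    GROUP BY Country
    """
).fetchall()

for row in result:
    print(row)
```

Multiplying by `1.0` plays the role of the DECIMAL cast here, since SQLite would otherwise perform integer division.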

SQL Aggregation / Window Function for Summarizing Data

I would like to create a query to do the following but I am having trouble:
I have a DB table with the columns:
TestYear (int, e.g. 2014)
Date (date, i.e. set of dates in a given year)
DailyWorstValue
RunningValue
Primary key is TestYear + Date
I would like to get the:
LAST RunningValue ordered by Date (i.e. the final value)
MINIMUM WorstValue (i.e. the worst value)
Per TestYear
This will basically be a one-row summary per TestYear. Is it possible to do this using window functions? Thank you very much in advance for any help that you can give.
I am not sure why you need a window function for this; plain aggregate functions will do the job:
SELECT testyear,
       MIN_DailyWorstValue = Min(dailyworstvalue),
       RV.last_runningvalue
FROM   db_table A
       CROSS APPLY (SELECT TOP 1 Last_RunningValue = runningvalue
                    FROM   db_table B
                    WHERE  A.testyear = B.testyear
                    ORDER  BY date DESC) RV
GROUP  BY testyear,
          RV.last_runningvalue
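Since the question asks specifically about window functions, here is one possible window-function formulation for comparison. It is a sketch under assumptions, not a replacement for the answer above: it uses Python's built-in sqlite3 module (SQLite 3.25 or later), invented sample rows, and `FIRST_VALUE` over a descending date order to pick the last running value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE db_table (TestYear INTEGER, Date TEXT, "
    "DailyWorstValue REAL, RunningValue REAL, PRIMARY KEY (TestYear, Date))"
)
# Invented sample data for two test years.
conn.executemany(
    "INSERT INTO db_table VALUES (?, ?, ?, ?)",
    [
        (2014, "2014-01-01", -1.0, 10.0),
        (2014, "2014-06-01", -3.0, 8.0),
        (2014, "2014-12-31", -2.0, 12.5),
        (2015, "2015-01-01", -0.5, 1.0),
        (2015, "2015-12-31", -4.0, 3.0),
    ],
)

summary = conn.execute(
    """
    SELECT DISTINCT
           TestYear,
           MIN(DailyWorstValue) OVER (PARTITION BY TestYear) AS MinDailyWorstValue,
           -- Newest row first, so FIRST_VALUE returns the final running value.
           FIRST_VALUE(RunningValue) OVER (
               PARTITION BY TestYear ORDER BY Date DESC
           ) AS LastRunningValue
    FROM db_table
    ORDER BY TestYear
    """
).fetchall()

for row in summary:
    print(row)
```

Both window expressions return the same value on every row of a year, so `DISTINCT` collapses the result to one summary row per TestYear.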