Why MAX is NULL when projecting along the same dimension - ssas

Here is a vanilla "Sales" database:
CREATE TABLE Sales(id INT, category VARCHAR(50), item VARCHAR(50), date DATE, amount DECIMAL(10, 2));
INSERT INTO Sales VALUES
(1, 'Memory', 'Corsair 16GB', '2021-01-01', 200),
(2, 'Storage', 'Seagate BarraCuda 2TB', '2021-01-02', 50),
(3, 'Storage', 'Samsung 980 1TB', '2021-01-02', 150),
(4, 'OS', 'Windows 11', '2021-01-02', 150),
(5, 'OS', 'Ubuntu', '2021-01-03', 15),
(6, 'DBMS', 'MySQL Enterprise Edition 8', '2021-01-03', 5000),
(7, 'DBMS', 'SQL Server 2022', '2021-01-04', 15000),
(8, 'Memory', 'Corsair 16GB', '2021-01-04', 200),
(9, 'Memory', 'G.Skill Trident 32GB', '2021-01-04', 250),
(10, 'OS', 'Ubuntu', '2021-01-05', 15),
(11, 'DBMS', 'SQL Server 2022', '2021-01-06', 15000),
(12, 'DBMS', 'MySQL Enterprise Edition 8', '2021-01-06', 5000);
From it we build an SSAS Multidimensional cube with 3 dimensions:
Category with attribute Name,
Item with attribute Name,
Date with attribute Date.
So far so good.
Now I create some calculated measures to get the maximum amounts along some dimensions:
WITH
MEMBER [Max Daily Sale] AS MAX([Date].[Date].Children, [Amount])
MEMBER [Max Category Sale] AS MAX([Category].[Name].Children, [Amount])
MEMBER [Max Category Daily Sale] AS MAX(([Category].[Name].Children, [Date].[Date].Children), [Amount])
MEMBER [Max Item Daily Sale] AS MAX(([Item].[Name].Children, [Date].[Date].Children), [Amount])
MEMBER [Max Item Sale] AS MAX([Item].[Name].Children, [Amount])
They work fine, except when one of the dimensions used in the MDX query is also used in the measure.
As an example:
SELECT
[Category].[Name].Children ON ROWS,
--[Item].[Name].Children ON ROWS,
--[Date].[Date].Children ON ROWS,
{ Amount, [Max Daily Sale], [Max Category Sale], [Max Category Daily Sale], [Max Item Sale], [Max Item Daily Sale] } ON COLUMNS
FROM Sales
Gives:
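+----------+--------+----------------+-------------------+-------------------------+---------------+---------------------+
| Category | Amount | Max Daily Sale | Max Category Sale | Max Category Daily Sale | Max Item Sale | Max Item Daily Sale |
+----------+--------+----------------+-------------------+-------------------------+---------------+---------------------+
| DBMS     | 40000  | 20000          | (null)            | (null)                  | 30000         | 15000               |
| Memory   | 650    | 450            | (null)            | (null)                  | 400           | 250                 |
| OS       | 180    | 150            | (null)            | (null)                  | 150           | 150                 |
| Storage  | 200    | 200            | (null)            | (null)                  | 150           | 150                 |
+----------+--------+----------------+-------------------+-------------------------+---------------+---------------------+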
All the measures using dimension Category, [Max Category Sale] and [Max Category Daily Sale], result in NULL.
In the same way, using [Item].[Name] as the ROWS axis will "break" the [Max Item Sale] and [Max Item Daily Sale] measures.
And so on for [Max Daily Sale], [Max Category Daily Sale], and [Max Item Daily Sale], when using [Date].[Date].
It's obviously some MDX triviality I'm missing, but which one?

You are correct in what you observe. Each dimension can only be used in one section of the MDX: either on ROWS, on COLUMNS, in the WHERE clause, or within a calculated member. Sometimes an MDX statement will appear to break that rule, but look closely and you'll see that the dimension is only being used in one consistent way.
There are ways around this, often with the use of [Dimension name].currentMember, but I'm not sure what to suggest for your particular case. Can this report be broken down into separate reports, to simplify matters?

Moving avg using OVER RANGE BETWEEN days

I want to find the average of item price bought within the last 365 days. Items are not guaranteed to be bought every day, so I can't fix the number of rows to look back at. So I am trying to use RANGE instead of ROWS, specifying that I look back 365 days from current row's date.
Sample data:
Group by Store and Item
I want to find the avg of prices bought within the last 12 months
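+---------+--------+-------------+-------+--------------------------------+
| Store   | Item   | Date bought | Price | Avg price across last 365 days |
+---------+--------+-------------+-------+--------------------------------+
| Store 1 | Item 1 | 1/2/2022    | 1.00  | 1.00                           |
| Store 1 | Item 1 | 6/1/2022    | 1.75  | 1.375                          |
| Store 1 | Item 1 | 11/2/2022   | 2.10  | 1.617                          |
| Store 1 | Item 1 | 1/5/2023    | 3.00  | 2.283                          |
| Store 2 | Item 1 | 3/2/2022    | 1.55  | 1.55                           |
| Store 2 | Item 1 | 5/5/2022    | 2.80  | 2.175                          |
+---------+--------+-------------+-------+--------------------------------+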
I have tried:
SELECT
store, item, date, price,
SUM(price) OVER (PARTITION BY store, item
ORDER BY date ASC
RANGE BETWEEN 365 DAY PRECEDING AND CURRENT ROW) AS avg_price
FROM table
Error I get is:
Msg 102, Level 15, State 1, Line 102
Incorrect syntax near 'DAY'
I have tried these variations to address the error but can't get past it:
1. RANGE BETWEEN '365' DAY PRECEDING AND CURRENT ROW
2. RANGE BETWEEN INTERVAL 365 DAY PRECEDING AND CURRENT ROW
3. RANGE BETWEEN 365 PRECEDING AND CURRENT ROW
Variation 3 produces the error:
Msg 4194, Level 16, State 1, Line 98
RANGE is only supported with UNBOUNDED and CURRENT ROW window frame delimiters.
Is this a syntax error? I am using Microsoft SQL Server Management Studio.
SELECT
store,
item,
date,
price,
AVG(price) AS avg_price
FROM table
WHERE
date > DATEADD(year, -1, GETDATE())
GROUP BY
store,
item,
date,
price
The WHERE clause restricts the data to rows from the last year. SQL already comes with an averaging function, AVG. Remove the GROUP BY if you don't want the rows grouped; in that case select only AVG(price), because SQL Server rejects non-aggregated columns in the SELECT list when there is no GROUP BY.
A good old self-join should work (I converted your dates into ISO format):
with cte as (
select *
from (
VALUES (N'Store 1', N'Item 1', N'2022-01-02', 1.00, 1.00)
, (N'Store 1', N'Item 1', N'2022-06-01', 1.75, 1.375)
, (N'Store 1', N'Item 1', N'2022-11-02', 2.10, 1.617)
, (N'Store 1', N'Item 1', N'2023-01-05', 3.00, 2.283)
, (N'Store 2', N'Item 1', N'2022-03-02', 1.55, 1.55)
, (N'Store 2', N'Item 1', N'2022-05-05', 2.80, 2.175)
) t (Store,Item,[Date bought],Price,[Avg price across last 365 days])
)
select AVG(c2.price), c.Store, c.Item, c.[Date bought]
from CTE c
LEFT JOIN CTE c2
On c2.Store = c.Store
AND c2.Item = c.Item
AND c2.[Date bought] between DATEADD(YEAR, -1,CAST(c.[Date bought] AS DATETIME)) AND c.[Date bought]
GROUP BY c.Store, c.Item, c.[Date bought]
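To sanity-check the self-join idea against the sample data, here is a minimal sketch that runs an equivalent query through Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here: the table name `purchases`, the column name `bought`, and the `date(..., '-365 days')` call are assumptions of this sketch (SQL Server would use DATEADD as above), and the third Store 1 date is taken as 2022-11-02 from the question.

```python
import sqlite3

# Sample rows from the question, with dates rewritten in ISO format
# so that plain string comparison orders them chronologically.
rows = [
    ("Store 1", "Item 1", "2022-01-02", 1.00),
    ("Store 1", "Item 1", "2022-06-01", 1.75),
    ("Store 1", "Item 1", "2022-11-02", 2.10),
    ("Store 1", "Item 1", "2023-01-05", 3.00),
    ("Store 2", "Item 1", "2022-03-02", 1.55),
    ("Store 2", "Item 1", "2022-05-05", 2.80),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (store TEXT, item TEXT, bought TEXT, price REAL)"
)
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", rows)

# For every purchase c, average all purchases p of the same store and item
# that fall within the 365 days up to and including c's date.
query = """
SELECT c.store, c.item, c.bought, AVG(p.price) AS avg_price
FROM purchases AS c
JOIN purchases AS p
  ON p.store = c.store
 AND p.item = c.item
 AND p.bought BETWEEN date(c.bought, '-365 days') AND c.bought
GROUP BY c.store, c.item, c.bought
ORDER BY c.store, c.item, c.bought
"""
averages = conn.execute(query).fetchall()
for store, item, bought, avg_price in averages:
    print(store, item, bought, round(avg_price, 3))
```

On this data the averages, rounded to three decimals, agree with the expected column in the question: the 2023-01-05 row averages only the three most recent prices because 2022-01-02 lies more than 365 days back.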

How to group and collect data in PostgreSQL by date periods?

Here is some demo data:
create table Invoices (
id INT,
name VARCHAR,
customer_id INT,
total_amount FLOAT,
state VARCHAR,
invoice_date DATE
);
INSERT INTO Invoices
(id, name, customer_id, total_amount, state, invoice_date)
VALUES
(1, 'INV/2020/0001', 2, 100, 'posted', '2020-04-05'),
(2, 'INV/2020/0002', 1, 100, 'draft', '2020-04-05'),
(3, 'INV/2020/0003', 2, 100, 'draft', '2020-05-24'),
(4, 'INV/2020/0004', 1, 100, 'posted', '2020-05-25'),
(5, 'INV/2020/0005', 2, 100, 'posted', '2020-06-05'),
(6, 'INV/2020/0006', 1, 100, 'posted', '2020-07-05'),
(7, 'INV/2020/0007', 1, 100, 'draft', '2020-08-24'),
(8, 'INV/2020/0008', 1, 100, 'posted', '2020-08-25'),
(9, 'INV/2020/0009', 1, 100, 'posted', '2020-09-05'),
(10, 'INV/2020/0010', 1, 100, 'draft', '2020-09-05'),
(11, 'INV/2020/0011', 2, 100, 'draft', '2020-10-24'),
(12, 'INV/2020/0012', 1, 100, 'posted', '2020-10-25'),
(13, 'INV/2020/0013', 2, 100, 'posted', '2020-11-05'),
(14, 'INV/2020/0014', 1, 100, 'posted', '2020-11-05'),
(15, 'INV/2020/0015', 2, 100, 'draft', '2020-11-24'),
(16, 'INV/2020/0016', 1, 100, 'posted', '2020-11-25')
I have a query that computes a sum of all posted invoices for customer with id = 1
SELECT sum(total_amount), customer_id
FROM Invoices
WHERE state = 'posted' AND customer_id = 1
GROUP BY customer_id
I need to group the data (sum(total_amount)) into 3 time periods of 2 or 3 months each. The period length (2 or 3) must be easy to change: I want to pass it as a parameter to the query from Python code.
I also need the average of the period sums.
Can you help me please?
Expected output for period = 2 months is:
+--------------+--------------+--------------+--------+
| Period_1_sum | Period_2_sum | Period_3_sum | Avg |
+--------------+--------------+--------------+--------+
| 300 | 300 | 100 | 233.33 |
+--------------+--------------+--------------+--------+
You can use conditional aggregation for that:
SELECT customer_id,
sum(total_amount) as total_amount,
sum(total_amount) filter (where invoice_date >= date '2020-04-01' and invoice_date < date '2020-07-01') as period_1_sum,
sum(total_amount) filter (where invoice_date >= date '2020-07-01' and invoice_date < date '2020-10-01') as period_2_sum,
sum(total_amount) filter (where invoice_date >= date '2020-10-01' and invoice_date < date '2021-01-01') as period_3_sum
FROM Invoices
WHERE state = 'posted'
GROUP BY customer_id
By changing the filter condition you can control which rows are aggregated for each period.
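Since the period length is meant to come from Python, one option is to compute the period boundaries in Python and pass them as bound parameters. The sketch below is only an illustration under several assumptions: it uses the built-in sqlite3 module as a stand-in for PostgreSQL, the helper names `add_months` and `period_sums` are hypothetical, the first period is assumed to start on 2020-04-01, the table is reduced to the four columns the query needs, and SUM(CASE ...) replaces FILTER so the same SQL shape runs on both engines.

```python
import sqlite3
from datetime import date


def add_months(first_of_month, n):
    """Return the first day of the month n months after first_of_month."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) + n
    return date(index // 12, index % 12 + 1, 1)


def period_sums(conn, customer_id, start, months_per_period, periods=3):
    """Sum posted invoices per period; returns (list_of_sums, average)."""
    bounds = [add_months(start, i * months_per_period) for i in range(periods + 1)]
    # One conditional SUM per period. Only generated text goes into the SQL;
    # every value is passed as a bound parameter.
    columns = ", ".join(
        "COALESCE(SUM(CASE WHEN invoice_date >= ? AND invoice_date < ? "
        "THEN total_amount END), 0)"
        for _ in range(periods)
    )
    params = []
    for lo, hi in zip(bounds, bounds[1:]):
        params += [lo.isoformat(), hi.isoformat()]
    sql = f"SELECT {columns} FROM Invoices WHERE state = 'posted' AND customer_id = ?"
    sums = list(conn.execute(sql, params + [customer_id]).fetchone())
    return sums, sum(sums) / periods


# Demo data from the question: (customer_id, state, invoice_date), amount 100 each.
rows = [
    (2, "posted", "2020-04-05"), (1, "draft", "2020-04-05"),
    (2, "draft", "2020-05-24"), (1, "posted", "2020-05-25"),
    (2, "posted", "2020-06-05"), (1, "posted", "2020-07-05"),
    (1, "draft", "2020-08-24"), (1, "posted", "2020-08-25"),
    (1, "posted", "2020-09-05"), (1, "draft", "2020-09-05"),
    (2, "draft", "2020-10-24"), (1, "posted", "2020-10-25"),
    (2, "posted", "2020-11-05"), (1, "posted", "2020-11-05"),
    (2, "draft", "2020-11-24"), (1, "posted", "2020-11-25"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoices "
    "(customer_id INT, total_amount REAL, state TEXT, invoice_date TEXT)"
)
conn.executemany("INSERT INTO Invoices VALUES (?, 100, ?, ?)", rows)

sums, average = period_sums(conn, 1, date(2020, 4, 1), months_per_period=3)
print(sums, round(average, 2))
```

With months_per_period=3 this gives sums of 100, 300 and 300 for customer 1 and an average of 233.33, the same three values as in the expected output but in chronological order. With a PostgreSQL driver such as psycopg the placeholders would be %s rather than ?.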

Insert trigger for invoice table

I have an Invoice table with columns [customer no.], amount and [invoice no.].
I have to create a separate numbering of invoices per client, i.e. for client 1 we have invoices 1/1, 1/2, 1/3, 1/4, ..., 1/n, and for each subsequent client k the invoices k/1, k/2, ..., k/n.
For example, for query
INSERT INTO [dbo].[Invoice] ([customer no.], [amount])
VALUES (1, 123.23), (1, 323.23), (2, 123.23), (3, 123.23), (4, 123.23)
the [invoice no.] column should get values 1/1, 1/2, 2/1, 3/1, 4/1.
Do you have any idea how to do this?
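One way to picture the numbering rule is a trigger that counts the customer's invoices up to and including the new row. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table and column names (`invoice`, `id`, `customer_no`, `invoice_no`) are simplified assumptions, and SQLite triggers fire once per row, whereas a SQL Server trigger sees the whole `inserted` set at once and would need a set-based statement instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    customer_no INTEGER NOT NULL,
    amount      REAL    NOT NULL,
    invoice_no  TEXT
);

-- After each inserted row, number it as <customer>/<n>, where n is the
-- count of that customer's invoices with an id up to the new row's id.
CREATE TRIGGER invoice_number AFTER INSERT ON invoice
BEGIN
    UPDATE invoice
    SET invoice_no = NEW.customer_no || '/' || (
        SELECT COUNT(*)
        FROM invoice AS i
        WHERE i.customer_no = NEW.customer_no
          AND i.id <= NEW.id
    )
    WHERE id = NEW.id;
END;
""")

conn.execute(
    "INSERT INTO invoice (customer_no, amount) "
    "VALUES (1, 123.23), (1, 323.23), (2, 123.23), (3, 123.23), (4, 123.23)"
)
numbers = [row[0] for row in conn.execute("SELECT invoice_no FROM invoice ORDER BY id")]
print(numbers)
```

Counting existing rows is the simplest rule to show, but it reuses numbers if invoices are ever deleted and does nothing about concurrent inserts, so treat it as an illustration of the numbering logic rather than a finished design.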

Use last value with operations in SQL Server

Let's assume I have a table in SQL Server called Budget_Spend like this
I know that with a proper GROUP BY, SUM and ORDER BY I can reach the next table (that part is pretty obvious)
However, I don't know how to replicate the "Available" column, constructed following this logic:
For the first month, it's Budget - Spend - Taxes
For the following months, it's computed as PREVIOUS(Available) - CURRENT(Spend) - CURRENT(Taxes)
I've tried to use the LAG function without success (most of my attempts didn't run due to syntax problems).
Any idea how to do this? I imagine I need LAG and maybe a CASE in order to get the first value.
This is the DDL for creating the table
/* CREATE TABLE */
CREATE TABLE Budget_Spend(
Month FLOAT,
Budget FLOAT,
Spend FLOAT,
Taxes FLOAT);
/* INSERT */
INSERT INTO Budget_Spend(Month, Budget, Spend, Taxes) VALUES
(1, 1000, 75, 11.25);
INSERT INTO Budget_Spend(Month, Budget, Spend, Taxes) VALUES
(1, 1000, 25, 3.75);
INSERT INTO Budget_Spend(Month, Budget, Spend, Taxes) VALUES
(2, 1000, 200, 30);
INSERT INTO Budget_Spend(Month, Budget, Spend, Taxes) VALUES
(3, 1000, 150, 22.5);
INSERT INTO Budget_Spend(Month, Budget, Spend, Taxes) VALUES
(4, 1000, 10, 1.5);
INSERT INTO Budget_Spend(Month, Budget, Spend, Taxes) VALUES
(4, 1000, 10, 1.5);
You need a window function:
select bs.*,
Budget - sum(Spend + Taxes) over (order by month) as Available
from (select month, Budget, sum(Spend) as Spend, sum(Taxes) as Taxes
from Budget_Spend bs
group by month, Budget
) bs;
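The reason a running sum works here: repeatedly subtracting each month's Spend and Taxes from the previous balance is the same as subtracting the cumulative total of Spend + Taxes from the budget. As a rough check, the sketch below runs the same shape of query on the sample rows through Python's built-in sqlite3 module. SQLite is an assumed stand-in for SQL Server, it needs SQLite 3.25 or later for window functions, and REAL replaces the original column types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Budget_Spend (Month INT, Budget REAL, Spend REAL, Taxes REAL)"
)
conn.executemany(
    "INSERT INTO Budget_Spend VALUES (?, ?, ?, ?)",
    [
        (1, 1000, 75, 11.25), (1, 1000, 25, 3.75), (2, 1000, 200, 30),
        (3, 1000, 150, 22.5), (4, 1000, 10, 1.5), (4, 1000, 10, 1.5),
    ],
)

# Inner query: one row per month. Outer query: budget minus the running
# total of Spend + Taxes up to and including the current month.
query = """
SELECT bs.Month, bs.Spend, bs.Taxes,
       bs.Budget - SUM(bs.Spend + bs.Taxes) OVER (ORDER BY bs.Month) AS Available
FROM (SELECT Month, Budget, SUM(Spend) AS Spend, SUM(Taxes) AS Taxes
      FROM Budget_Spend
      GROUP BY Month, Budget) AS bs
ORDER BY bs.Month
"""
available = conn.execute(query).fetchall()
for row in available:
    print(row)
```

For the sample rows the Available column comes out as 885, 655, 482.5 and 459.5 for months 1 to 4, which matches applying the month-by-month rule by hand.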

Find the customers who bought ProductA in any month and then bought ProductB in the immediate next month

Consider this table,
CREATE TABLE ProductSale
(
cust INT,
[Month] INT,
amt INT,
product VARCHAR(255)
)
INSERT INTO ProductSale (cust, Month, amt, product)
VALUES (103, 11, 493, 'pizza'), (103, 12, 304, 'drink'),
(103, 10, 189, 'drink'), (100, 12, 270, 'pizza'),
(100, 11, 187, 'drink'), (102, 8, 378, 'drink'),
(101, 10, 490, 'drink'), (101, 9, 123, 'Pizza')
A customer buys one product in a month and follows up by buying another product the next month.
I would like to get the records of customers who bought pizza in any month and then bought a drink in the immediately following month.
For example, 103 is such a customer. 100 looks like one, but is not.
How can I achieve this using a SQL query?
You may achieve this by using cross apply.
select p.* from ProductSale as p
cross apply (
select * from ProductSale as ps
where p.cust=ps.cust
and p.month+1=ps.month
and ps.product = 'drink'
and p.product='pizza' ) as pg
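As a quick check of the join logic, the sketch below runs an equivalent self-join on the sample rows using Python's built-in sqlite3 module (an assumed stand-in for SQL Server, so CROSS APPLY is replaced by a plain JOIN). LOWER() is an addition of this sketch, because SQLite compares strings case-sensitively and the sample data contains both 'pizza' and 'Pizza'. Like the query above, it does not handle a December to January wrap-around, since the table has no year column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProductSale (cust INT, month INT, amt INT, product TEXT)")
conn.executemany(
    "INSERT INTO ProductSale VALUES (?, ?, ?, ?)",
    [
        (103, 11, 493, "pizza"), (103, 12, 304, "drink"),
        (103, 10, 189, "drink"), (100, 12, 270, "pizza"),
        (100, 11, 187, "drink"), (102, 8, 378, "drink"),
        (101, 10, 490, "drink"), (101, 9, 123, "Pizza"),
    ],
)

# Pair each pizza purchase p with a drink purchase d made by the same
# customer in the month directly after it.
query = """
SELECT DISTINCT p.cust
FROM ProductSale AS p
JOIN ProductSale AS d
  ON d.cust = p.cust
 AND d.month = p.month + 1
WHERE LOWER(p.product) = 'pizza'
  AND LOWER(d.product) = 'drink'
ORDER BY p.cust
"""
buyers = [row[0] for row in conn.execute(query)]
print(buyers)
```

On this data it returns customers 101 and 103. Customer 100 is excluded because the drink (month 11) was bought before the pizza (month 12), not after it.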