Calculating filtered cost - SSAS

(Edited for clarity)
I am new to DAX and Tabular, and have run into a problem.
We have a tabular model with multiple fact tables sharing some dimensions.
Our challenge is that the cost for one fact should be calculated on the basis of the cost in another table, using an order relationship.
The idea is that the DBM cost is the correct one, but the cost in Adform might be from some other channel, in which case we should use the cost from Adform.
So we need to check whether there is an associated DBM cost, related via orders.
We have:
Adform Fact
Adform ID, Order Id, Impressions, Cost
1 , 1 , 100 , 50
2 , 2 , 200 , 68
3 , 2 , 200 , 100
4 , 3 , 200 , 100
5 , -1 , 600 , 300
DBM Fact
DBM ID, Order Id, Impressions, Cost
1 , 1 , 50 , 25
2 , 1 , 20 , 10
3 , 1 , 10 , 10
4 , 2 , 60 , 100
5 , 2 , 80 , 75
6 , -1 , 500 , 1000
And Order dimension
Order Id, Order Name
1 , "Campaign 1"
2 , "Campaign 2"
3 , "Campaign 3"
-1 , "Unknown Order"
Now we need to do the following: the cost has to be taken from the Adform cost, unless the same order has an associated cost in the DBM table. In that case, the cost should be calculated as the sum of the DBM cost, filtered on the common dimensions.
I've tried this:
CALCULATE ( IF ( AND ( COUNTROWS ( 'AdForm' ) > 0, [Cost (DBM)] > 0 ), [Cost (DBM)], [Cost (AdForm)] ) )
This works as expected when I drill down on orders. However, it only works at the aggregate level, so when I drill down on other common dimensions such as Date or Client, I simply get the sum of the DBM cost as the cost.
What I would like to get is the sum of the DBM cost for all orders that have a DBM cost and, for all other orders, the Adform cost.

Tight spots can be uncomfortable, but remember that diamonds are only made under extreme pressure...
I would first add Measures:
Adform Cost = SUM ( 'Adform Fact'[Cost] )
DBM Cost = SUM ( 'DBM Fact'[Cost] )
Then I would use SUMX to run a similar (but simplified) calculation on a row-by-row basis across the 'Order dimension' table:
Order Cost =
SUMX (
    'Order dimension',
    IF ( ISBLANK ( [DBM Cost] ), [Adform Cost], [DBM Cost] )
)


How to fetch data from same column using different criteria?

Hi, I have a table like this in a SQL Anywhere database:
CUSTID    DDAT          AMOUNT
1         01-01-2021    1000
1         02-02-2021    2000
1         03-02-2021    3000
1         04-02-2021    4000
2         01-04-2021    1000
2         02-04-2021    2000
2         04-04-2021    1000
I want data like this in VB.NET, where Amount is the amount for a single date and Total amount is the total over all of that customer's rows:
Cust id    Date          Amount    Total amount
1          04-04-2021    4000      10000
2          04-04-2021    1000      4000
Can you give me any solution? Thanks in advance.
My take on it:
select custid, dat, amount, total_amount
from (
    select custid, dat, amount,
           sum(amount) over (partition by custid) as total_amount
    from data
) d
where dat = '2021-04-04' -- or any other date
It might be that the inner select is all you need; I'm not sure whether the filter on the date is necessary.
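If the intent is actually each customer's most recent row rather than a fixed date (the expected output shows 4000 for customer 1, which is that customer's latest amount), a minimal sketch using row_number() — reusing the placeholder names (data, dat) from the query above, and assuming your SQL Anywhere version supports window functions (the query above already uses one):
select custid, dat, amount, total_amount
from (
    select custid, dat, amount,
           sum(amount) over (partition by custid) as total_amount,
           -- rn = 1 marks the newest row for each customer
           row_number() over (partition by custid order by dat desc) as rn
    from data
) d
where rn = 1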

How to create a list of customers who buy less over time?

I have customer transaction data as shown below and need to create a list of only those customers whose total ordered units are consistently less than in their previous order, i.e. the total quantity purchased in the nth order is less than the total quantity purchased in the (n-1)th order, and so on for each earlier order. Another way to say it: list the customers who buy fewer units as time goes forward.
Custid date units
123 28-03-17 100
123 27-03-17 100
123 26-03-17 100
999 25-03-17 10
999 24-03-17 20
893 24-03-17 39
893 28-03-17 48
893 24-03-17 10
893 19-03-17 75
893 12-02-17 10
The output of the code should be customer 999.
I initially thought of using the lag function after sorting the transactions and then applying a conditional statement, but the number of transactions varies across customers.
Regards
Use lag() and conditional aggregation:
select custid
from (select t.*,
             lag(units) over (partition by custid order by date) as prev_units
      from t
     ) t
group by custid
having sum(case when units >= prev_units then 1 else 0 end) = 0;
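One caveat: the sample data has two rows for customer 893 on 24-03-17, so the ordering inside lag() is ambiguous for same-day orders. A minimal sketch that compares daily totals instead, assuming that is the intent (table and column names as above):
select custid
from (select custid, date,
             sum(units) as units,
             -- previous day's total for the same customer
             lag(sum(units)) over (partition by custid order by date) as prev_units
      from t
      group by custid, date
     ) t
group by custid
having sum(case when units >= prev_units then 1 else 0 end) = 0;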

Calculate percentages of columns in Oracle SQL

I have three columns, all consisting of 1s and 0s. For each of these columns, how can I calculate the percentage of people (one person is one row/id) who have a 1 in the first column and a 1 in the second or third column in Oracle SQL?
For instance:
id marketing_campaign personal_campaign sales
1 1 0 0
2 1 1 0
1 0 1 1
4 0 0 1
So in this case, of all the people who were subjected to a marketing_campaign, 50 percent were subjected to a personal campaign as well, but zero percent is present in sales (no one bought anything).
Ultimately, I want to find out the order in which people get to the sales moment. Do they first go from marketing campaign to a personal campaign and then to sales, or do they buy anyway regardless of these channels.
This is a fictional example, so I realize that in this example there are many other ways to do this, but I hope anyone can help!
The outcome that I'm looking for is something like this:
percentage marketing_campaign/ personal campaign = 50 %
percentage marketing_campaign/sales = 0%
etc. (for all three column combinations)
Use COUNT, SUM and CASE expressions, together with the basic arithmetic operators +, /, *:
COUNT(*) gives the total count of people in the table.
SUM(column) gives the sum of the 1s in the given column.
CASE expressions make it possible to implement more complex conditions.
The common pattern is X / COUNT(*) * 100, which calculates the percentage of a given value (val / total * 100%).
An example:
SELECT
    -- percentage of people that have 1 in the marketing_campaign column
    SUM( marketing_campaign ) / COUNT(*) * 100 AS marketing_campaign_percent,
    -- percentage of people that have 1 in the sales column
    SUM( sales ) / COUNT(*) * 100 AS sales_percent,
    -- complex condition:
    -- percentage of people (one person is one row/id) who have a 1
    -- in the first column and a 1 in the second or third column
    COUNT(
        CASE WHEN marketing_campaign = 1
              AND ( personal_campaign = 1 OR sales = 1 )
             THEN 1 END
    ) / COUNT(*) * 100 AS complex_condition_percent
FROM the_table;
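The query above divides by the total number of rows. If the 50% / 0% figures in the question are meant to be relative only to the people who were subjected to the marketing campaign, a minimal sketch that restricts the denominator accordingly (the_table and the alias names are placeholders; NULLIF guards against division by zero):
SELECT
    SUM( CASE WHEN marketing_campaign = 1 AND personal_campaign = 1 THEN 1 ELSE 0 END )
        / NULLIF( SUM( marketing_campaign ), 0 ) * 100 AS personal_given_marketing_percent,
    SUM( CASE WHEN marketing_campaign = 1 AND sales = 1 THEN 1 ELSE 0 END )
        / NULLIF( SUM( marketing_campaign ), 0 ) * 100 AS sales_given_marketing_percent
FROM the_table;
For the sample data this gives 50% and 0%, matching the expected outcome in the question.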
You can get your percentages like this:
SELECT COUNT(*),
       ROUND( 100 * ( SUM(personal_campaign) / SUM(COUNT(*)) OVER () ), 2 ) perc_personal_campaign,
       ROUND( 100 * ( SUM(sales) / SUM(COUNT(*)) OVER () ), 2 ) perc_sales
FROM (
    SELECT ID,
           CASE WHEN SUM(personal_campaign) > 0 THEN 1 ELSE 0 END AS personal_campaign,
           CASE WHEN SUM(sales) > 0 THEN 1 ELSE 0 END AS sales
    FROM the_table
    WHERE ID IN (SELECT ID FROM the_table WHERE marketing_campaign = 1)
    GROUP BY ID
)
I have overcomplicated things a bit because your data is still unclear to me. The subquery ensures that all duplicates are cleaned up and that each person ends up with a single row containing a 1 or 0 for personal_campaign and sales.
About your second question:
"Ultimately, I want to find out the order in which people get to the sales moment. Do they first go from marketing campaign to a personal campaign and then to sales, or do they buy anyway regardless of these channels."
This is impossible to do as things stand, because your table has neither:
a unique row identifier that would keep the order in which the rows were inserted, nor
a timestamp column that would tell when the rows were inserted.
Without one of these, the order of rows returned from your table is unpredictable or, if you prefer, purely random.

Find the proportion of rows verifying a condition in a single SQL query

Suppose I have a sales table which is as follows:
ID | Price
---+------
1  | 0.33
2  | 1.5
3  | 0.5
4  | 10
5  | 0.99
I would like to find, in a single query, the proportion of rows verifying a given condition. For example, if the condition is Price < 1, the result should be 3/5 = 0.6.
The only workaround that I have found so far is:
SELECT
    SUM(
        CASE
            WHEN Price < 1 THEN 1
            WHEN Price >= 1 THEN 0
        END
    ) / COUNT(*)
FROM sales
but is there a way to do this without CASE?
You can do it with IF:
SELECT SUM(IF(Price < 1, 1, 0))/COUNT(*) FROM sales
but it's not much different from CASE (your logic here is correct).
You may want to use WHERE (to count only rows with Price < 1), but since you also need the total COUNT, that alone doesn't work in your case. Another option is to get the total count separately:
SELECT
    COUNT(sales.Price) / total_count
FROM
    sales
    CROSS JOIN (SELECT COUNT(*) AS total_count FROM sales) AS c
WHERE
    -- you're summing 1 or 0 depending on Price, so your sum is
    -- just the count of rows where Price < 1
    sales.Price < 1
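If this happens to be MySQL (the IF() in the first answer suggests it), a comparison already evaluates to 1 or 0, so AVG can compute the proportion directly without CASE or IF — a minimal sketch, assuming MySQL and the sales table from the question:
-- proportion of rows with Price < 1 (0.6 for the sample data)
SELECT AVG(Price < 1) AS proportion
FROM sales;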

TERADATA Creating Group ID from Rolling SUM Limit

I have a list of products and a count corresponding to the quantity sold in a single table. The data is laid out as such:
Product Name QTY_SOLD
Mouse 23
Keyboard 25
Monitor 56
TV 10
Laptop 45
...
I want to create a group ID where groups are created if the ROLLING sum of the quantity sold is greater than 50. We can order by Product Name to get an output similar to the following.
Product Name QTY_SOLD GROUP_NBR
Keyboard 25 1
Laptop 45 1
Monitor 56 2
Mouse 23 3
TV 10 3
I created a CASE statement to produce the output I need, but if I want to change the group ID cutoff from 50 to, say, 100, or if I get more products and quantities, I have to keep changing the CASE statement. Is there an easy way to use either recursion or some other method to accommodate this?
This works on Teradata 13.10
UPDATE main
FROM prod_list AS main,
(
    SEL PROD_NAME
        , QTY_SOLD
        , SUM(QTY_SOLD) OVER (ORDER BY PROD_NAME ROWS UNBOUNDED PRECEDING) RUNNING
    FROM prod_list
) inr
SET GROUP_NBR = CASE
                    WHEN RUNNING < 50 THEN 1
                    WHEN RUNNING > 50 AND RUNNING < 100 THEN 2
                    WHEN RUNNING > 100 AND RUNNING < 150 THEN 3
                    WHEN RUNNING > 150 AND RUNNING < 200 THEN 4
                    WHEN RUNNING > 200 AND RUNNING < 250 THEN 5
                    ELSE 6
                END
WHERE main.PROD_NAME = inr.PROD_NAME;
When I first saw your question I thought it was a kind of bin-packing problem, but your query looks like you simply want to put your data into n buckets :-)
Teradata supports the QUANTILE function, but it's deprecated and it doesn't fit your requirements, as it creates buckets with an equal number of rows. You need WIDTH_BUCKET, which (as the name implies) creates buckets of equal width:
SELECT
    PROD_ID
    , COUNT(DISTINCT PROD_ID) AS QTY
    , SUM(QTY) OVER (ORDER BY QTY ROWS UNBOUNDED PRECEDING) RUNNING
    , WIDTH_BUCKET(RUNNING, 1, 120*2000000, 120) AS GRP_NBR
FROM TMP_WORK_DB.PROD_LIST
GROUP BY 1
You can easily change the size of a bucket (2000000) or the number of buckets (120).
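Applied to the original example with a cutoff of 50, the same idea might look like this — a minimal sketch, assuming the prod_list table from the question; the upper bound (10 buckets of width 50) just has to exceed the largest running total:
SELECT
    PROD_NAME
    , QTY_SOLD
    , SUM(QTY_SOLD) OVER (ORDER BY PROD_NAME ROWS UNBOUNDED PRECEDING) RUNNING
    -- 10 buckets of width 50: running totals 1-50 -> 1, 51-100 -> 2, and so on
    , WIDTH_BUCKET(RUNNING, 1, 10 * 50 + 1, 10) AS GRP_NBR
FROM prod_list;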
Create a reference table and join to it... then the change only needs to be made in a table (you can even create a procedure to help automate changes to the table later on).
Pseudo create:
Create table group_nbr (lower_limit, upper_limit, group_nbr)
Insert your case values into that table and inner join to it using greater-than and less-than conditions.
select *, group_nbr.group_nbr
from table
inner join group_nbr
    on RUNNING > lower_limit
   and RUNNING < upper_limit
The code won't quite work as it sits there, but hopefully you get the idea well enough to adapt your own code. I find keeping these values in reference tables like this far easier than altering code. You can even allow multiple group_nbr setups by adding a group_id column to the group_nbr table, having group_id 1 be one set of running limits and group_ids 2, 3, 4, 5, etc. be different sets, and using a WHERE clause to choose which group_id you want to use.
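A concrete version of that idea might look like the following — a minimal sketch, assuming Teradata and the prod_list naming from the question; the table definition and boundary values are illustrative only:
CREATE TABLE group_nbr (
    lower_limit INTEGER,
    upper_limit INTEGER,
    group_nbr   INTEGER
);

INSERT INTO group_nbr VALUES (0, 50, 1);
INSERT INTO group_nbr VALUES (50, 100, 2);
INSERT INTO group_nbr VALUES (100, 150, 3);  -- extend (or reload) as the cutoff changes

SELECT t.PROD_NAME, t.QTY_SOLD, g.group_nbr
FROM (
    SELECT PROD_NAME, QTY_SOLD,
           SUM(QTY_SOLD) OVER (ORDER BY PROD_NAME ROWS UNBOUNDED PRECEDING) AS RUNNING
    FROM prod_list
) t
JOIN group_nbr g
    -- boundaries chosen so every running total falls into exactly one group
    ON  t.RUNNING >  g.lower_limit
    AND t.RUNNING <= g.upper_limit;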
Hope the logic below helps, if it's about increments of 50.
UPDATE main
FROM prod_list AS main,
(
    SEL PROD_NAME
        , QTY_SOLD
        , SUM(QTY_SOLD) OVER (ORDER BY PROD_NAME ROWS UNBOUNDED PRECEDING) RUNNING
    FROM prod_list
) inr
SET GROUP_NBR = RUNNING / 50
WHERE main.PROD_NAME = inr.PROD_NAME;
This is the code I created based on Twelfth's suggestion.
-- create the first entry for the recursive query
INSERT TMP_WORK_DB.GRP_NBRS VALUES (0,1,0,2000000);

INSERT TMP_WORK_DB.GRP_NBRS (GRP_NBR, LOWER_LIMIT, UPPER_LIMIT)
WITH RECURSIVE GRP_RECRSV (GRP_NBR, LOWER_LIMIT, UPPER_LIMIT)
AS (
    SELECT
        1 AS GRP_NBR
        , LOWER_LIMIT
        , UPPER_LIMIT
    FROM TMP_WORK_DB.GRP_NBRS
    UNION ALL
    SELECT
        GRP_NBR + 1
        , LOWER_LIMIT + 2000000 -- set the interval to 2 million
        , UPPER_LIMIT + 2000000 -- can be adjusted as needed
    FROM GRP_RECRSV
    WHERE GRP_NBR < 120 -- needed a limit so that it would not be endless
)
SELECT * FROM GRP_RECRSV;

-- delete the first entry because it was duplicated
DELETE FROM TMP_WORK_DB.GRP_NBRS WHERE GRP_NBR = 0;

-- set grp nbr using the limits table
INSERT TMP_WORK_DB.PROD_LIST_GRP
WITH NUMOFPRODS (PROD_NAME, QTY, RUNNING) AS
(
    SELECT
        PROD_NAME
        , COUNT(DISTINCT PROD_ID) AS QTY
        , SUM(QTY) OVER (ORDER BY QTY ROWS UNBOUNDED PRECEDING) RUNNING
    FROM TMP_WORK_DB.PROD_LIST
    GROUP BY 1
)
SELECT
    PROD_NAME
    , QTY
    , RUNNING
    , GRP_NBR
FROM NUMOFPRODS a
JOIN TMP_WORK_DB.GRP_NBRS b
    ON RUNNING BETWEEN LOWER_LIMIT AND UPPER_LIMIT;