slowly changing dimension in MDX/OLAP - mdx

I have an OLAP cube that holds purchases per date, per vendor and some other dimensions.
Below is a sample of the data. The vendor is identified by the unique id VendorID:
Date CCID GLID CatID VendorID Amount
31-3-2012 659 55 25 807 124.5
14-5-2012 425 74 1 1452 371.53
1-4-2012 353 55 106 1648 26.79
2-7-2012 339 78 25 1275 1208
8-7-2012 460 66 41 4311 763.25
The vendor itself has a score with values 1-good, 2-average, 3-poor, 4-unattended. These scores vary over time.
Example for vendor 807:
VendorID VendorIDDate Score
807 1-1-2012 4-unattended
807 27-2-2013 2-average
807 1-4-2014 3-poor
807 31-12-2014 1-good
Now when I start a query I would like to count the number of vendors per Score for a specific slicer on GLID, CCID and CatID on a certain date.
What is the best way to model this?
I know I can add the score to the basic fact table using a look-up for each date, but I assume there is a much better way.
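One pattern that avoids stamping the score onto every fact row is to keep the score history in its own table and resolve the score that was in effect on the query date. The following is a minimal relational sketch of that idea using Python's sqlite3 module, not MDX and not a definitive cube design. The table names purchases and vendor_score, the ISO date format, and the function vendors_per_score are assumptions made for illustration; the rows are the sample data above, and only vendor 807 has a score history, so only that vendor is counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases (Date TEXT, CCID INT, GLID INT, CatID INT, VendorID INT, Amount REAL);
INSERT INTO purchases VALUES
  ('2012-03-31', 659, 55,  25,  807,  124.5),
  ('2012-05-14', 425, 74,   1, 1452,  371.53),
  ('2012-04-01', 353, 55, 106, 1648,   26.79),
  ('2012-07-02', 339, 78,  25, 1275, 1208),
  ('2012-07-08', 460, 66,  41, 4311,  763.25);

-- One row per score change; a score applies from VendorIDDate until the next change.
CREATE TABLE vendor_score (VendorID INT, VendorIDDate TEXT, Score TEXT);
INSERT INTO vendor_score VALUES
  (807, '2012-01-01', '4-unattended'),
  (807, '2013-02-27', '2-average'),
  (807, '2014-04-01', '3-poor'),
  (807, '2014-12-31', '1-good');
""")

def vendors_per_score(as_of, glid):
    """Count vendors per score as of a date, for vendors with purchases on a GLID."""
    sql = """
    SELECT s.Score, COUNT(DISTINCT s.VendorID)
    FROM vendor_score s
    WHERE s.VendorIDDate = (SELECT MAX(s2.VendorIDDate)   -- latest change on or before as_of
                            FROM vendor_score s2
                            WHERE s2.VendorID = s.VendorID
                              AND s2.VendorIDDate <= ?)
      AND s.VendorID IN (SELECT VendorID FROM purchases WHERE GLID = ?)
    GROUP BY s.Score
    ORDER BY s.Score
    """
    return conn.execute(sql, (as_of, glid)).fetchall()

print(vendors_per_score("2013-06-01", 55))
```

The same filter could be extended with CCID and CatID conditions in the inner purchases query.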

Related

Dynamically Calculate difference columns based off slicer- POWERBI

I have a table with quarterly volume data, and a slicer that lets you choose which quarter/year you want to see volume per code for. The slicer has 2019Q1 through 2021Q4 selections. I need to create a dynamic difference column that adjusts depending on which quarter/year is selected in the slicer. I know I need to create a new measure using CALCULATE/FILTER, but I am a beginner in Power BI and am unsure how to write that formula.
Example of raw table data:
Code 2019Q1 2019Q2 2019Q3 2019Q 2020Q1 2020Q2 2020Q3 2020Q4
11111 232 283 289 19 222 283 289 19
22222 117 481 231 31 232 286 2 19
11111 232 397 94 444 232 553 0 188
22222 117 411 15 14 232 283 25 189
Example if 2019Q1 and 2020Q1 are selected:
Code 2019Q1 2020Q1 Difference
11111 232 222 10
22222 117 481 -364
11111 232 397 -165
22222 117 411 -294
Power BI doesn't work that way; this is an Excel pivot table setup. You don't have any column that distinguishes the first row from the third, or the second from the fourth. They share the same code, so Power BI will aggregate their volumes. You could introduce a hidden index column, but then why not simply stick to Excel? The Power BI approach to the problem would be to unpivot (stack) your table into Code, Quarter and Volume columns, create 2 independent slicer tables for the minuend and the subtrahend, and then CALCULATE your aggregated differences based on the SELECTEDVALUE of the 2 slicers.
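To make the unpivot-then-subtract idea concrete, here is a small Python sketch of the logic only; it is not DAX and not how Power BI evaluates measures internally. The names QUARTERS, WIDE_ROWS, unpivot and difference are hypothetical, and only four of the quarter columns from the question are used to keep it short. The two arguments to difference stand in for the values picked in the two slicers.

```python
from collections import defaultdict

# A subset of the quarter columns from the question's wide table.
QUARTERS = ["2019Q1", "2019Q2", "2020Q1", "2020Q2"]
WIDE_ROWS = [
    ("11111", [232, 283, 222, 283]),
    ("22222", [117, 481, 232, 286]),
    ("11111", [232, 397, 232, 553]),
    ("22222", [117, 411, 232, 283]),
]

def unpivot(rows):
    """Stack the wide table into (code, quarter, volume) rows."""
    return [(code, quarter, volume)
            for code, values in rows
            for quarter, volume in zip(QUARTERS, values)]

def difference(long_rows, minuend, subtrahend):
    """Per code: total volume in `minuend` minus total volume in `subtrahend`."""
    totals = defaultdict(int)
    for code, quarter, volume in long_rows:
        if quarter == minuend:
            totals[code] += volume
        if quarter == subtrahend:
            totals[code] -= volume
    return dict(totals)

long_rows = unpivot(WIDE_ROWS)
print(difference(long_rows, "2019Q1", "2020Q1"))
```

Because rows with the same code are aggregated, the result has one difference per code rather than one per original row, which is the behaviour described above.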

how to select a value based on multiple criteria

I'm trying to select some values based on some proprietary data, and I just changed the variables to reference house prices.
I am trying to get the total offers for houses where they were sold at the bid or at the ask price, with offers under 15 and offers * sale price less than 5,000,000.
I then want to get the total number of offers for each neighborhood on each day, but instead I'm getting the total offers across each neighborhood (n1 + n2 + n3 + n4 + n5) across all dates and the total offers in the dataset across all dates.
My current query is this:
SELECT DISTINCT(neighborhood),
       DATE(date_of_sale),
       (SELECT SUM(offers)
        FROM `big_query.a_table_name.houseprices`
        WHERE ((offers * accepted_sale_price < 5000000)
               AND (offers < 15)
               AND (house_bid = sale_price OR
                    house_ask = sale_price))) as bid_ask_off,
       (SELECT SUM(offers)
        FROM `big_query.a_table_name.houseprices`) as total_offers,
FROM `big_query.a_table_name.houseprices`
GROUP BY neighborhood, DATE(date_of_sale) LIMIT 100
Which I am expecting a result like, with date being repeated throughout as d1, d2, d3, etc.:
but am instead receiving
I'm aware that there are some inherent problems with what I'm trying to select / group, but I'm not sure what to google or what tutorials to look at in order to perform this operation.
It's querying quite a bit of data, and I want to keep costs down, as I've already racked up a smallish bill on queries.
Any help or advice would be greatly appreciated, and I hope I've provided enough information.
Here is a sample dataframe.
neighborhood date_of_sale offers accepted_sale_price house_bid house_ask
bronx 4/1/2022 3 323 320 323
manhattan 4/1/2022 4 244 230 244
manhattan 4/1/2022 8 856 856 900
queens 4/1/2022 15 110 110 135
brooklyn 4/2/2022 12 115 100 115
manhattan 4/2/2022 9 255 255 275
bronx 4/2/2022 6 330 300 330
queens 4/2/2022 10 405 395 405
brooklyn 4/2/2022 4 254 254 265
staten_island 4/3/2022 2 442 430 442
staten_island 4/3/2022 13 195 195 225
bronx 4/3/2022 4 650 650 690
manhattan 4/3/2022 2 286 266 286
manhattan 4/3/2022 6 356 356 400
staten_island 4/4/2022 4 361 361 401
staten_island 4/4/2022 5 348 348 399
bronx 4/4/2022 8 397 340 397
manhattan 4/4/2022 9 333 333 394
manhattan 4/4/2022 11 392 325 392
I think that this is what you need.
As we group by neighbourhood we do not need DISTINCT.
We take SUM(offers) for total_offers directly from the table, and take the bids from a sub-query grouped by neighbourhood, which we join to.
SELECT
  h.neighborhood,
  DATE(h.date_of_sale) AS date_,
  s.bids AS bid_ask_off,
  SUM(h.offers) AS total_offers,
FROM
  `big_query.a_table_name.houseprices` h
LEFT JOIN
  (SELECT
     neighborhood,
     SUM(offers) AS bids
   FROM
     `big_query.a_table_name.houseprices`
   WHERE offers * accepted_sale_price < 5000000
     AND offers < 15
     AND (house_bid = sale_price OR
          house_ask = sale_price)
   GROUP BY neighborhood) s
ON h.neighborhood = s.neighborhood
GROUP BY
  h.neighborhood,
  DATE(h.date_of_sale),
  s.bids
LIMIT 100;
Or the following, which modifies the initial query more but may be closer to what you need.
SELECT
  h.neighborhood,
  DATE(h.date_of_sale) AS date_,
  s.bids AS bid_ask_off,
  SUM(h.offers) AS total_offers,
FROM
  `big_query.a_table_name.houseprices` h
LEFT JOIN
  -- aggregate per calendar day so the join matches the outer DATE() grouping
  (SELECT
     DATE(date_of_sale) AS dos,
     neighborhood,
     SUM(offers) AS bids
   FROM
     `big_query.a_table_name.houseprices`
   WHERE offers * accepted_sale_price < 5000000
     AND offers < 15
     AND (house_bid = sale_price OR
          house_ask = sale_price)
   GROUP BY
     neighborhood,
     DATE(date_of_sale)) s
ON h.neighborhood = s.neighborhood
AND DATE(h.date_of_sale) = s.dos
GROUP BY
  h.neighborhood,
  DATE(h.date_of_sale),
  s.bids
LIMIT 100;
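Since query cost is a concern, another option worth considering is conditional aggregation, which computes both sums in a single pass over the table without a join. The sketch below runs that idea against the first two days of the sample data using Python's sqlite3 module rather than BigQuery, so treat it as an illustration of the SQL shape only. It assumes that sale_price in the question refers to the accepted_sale_price column of the sample, and that date_of_sale is already a plain date (in BigQuery you would keep DATE(date_of_sale)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE houseprices (
    neighborhood TEXT, date_of_sale TEXT, offers INT,
    accepted_sale_price INT, house_bid INT, house_ask INT)""")
conn.executemany("INSERT INTO houseprices VALUES (?, ?, ?, ?, ?, ?)", [
    ("bronx",     "2022-04-01",  3, 323, 320, 323),
    ("manhattan", "2022-04-01",  4, 244, 230, 244),
    ("manhattan", "2022-04-01",  8, 856, 856, 900),
    ("queens",    "2022-04-01", 15, 110, 110, 135),
    ("brooklyn",  "2022-04-02", 12, 115, 100, 115),
    ("manhattan", "2022-04-02",  9, 255, 255, 275),
    ("bronx",     "2022-04-02",  6, 330, 300, 330),
    ("queens",    "2022-04-02", 10, 405, 395, 405),
    ("brooklyn",  "2022-04-02",  4, 254, 254, 265),
])

# Both totals come from one scan: the CASE keeps only qualifying offers.
SQL = """
SELECT neighborhood,
       date_of_sale,
       SUM(CASE WHEN offers * accepted_sale_price < 5000000
                 AND offers < 15
                 AND (house_bid = accepted_sale_price
                      OR house_ask = accepted_sale_price)
                THEN offers ELSE 0 END) AS bid_ask_off,
       SUM(offers) AS total_offers
FROM houseprices
GROUP BY neighborhood, date_of_sale
ORDER BY date_of_sale, neighborhood
"""

# Map (neighborhood, date) -> (bid_ask_off, total_offers) for easy lookup.
result = {(n, d): (b, t) for n, d, b, t in conn.execute(SQL)}
for key, value in result.items():
    print(key, value)
```

Rows that fail the filter still count toward total_offers, which is why the queens row with 15 offers shows 0 qualifying offers but 15 total.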

Applying Logic to Sets of Rows

I want to add logic that calculates the price per claim. Below, there are two claims, one for patient 5 and another for patient 6. My original idea is to create a unique list of patient numbers in a separate table, then sort the original table by these unique patient numbers and run conditional statements to output a single value (the reimbursement value), then iterate through the unique table until completed. Does this sound like a feasible workflow? I am not necessarily looking for specific code, but more of a workflow/process.
For example/context:
PatNo RevCode CPT BilledCharges DRG
5 141 null 100 880
5 636 J1234 50 null
6 111 null 8000 783
6 636 J1234 300 null
PSYCH look up table: if claim has DRG on table then calculate 75% of BilledCharges for claim.
DRG Service Category
876 PSYCH
880 PSYCH
881 PSYCH
882 PSYCH
883 PSYCH
884 PSYCH
885 PSYCH
886 PSYCH
887 PSYCH
C- Section look up table: if claim has DRG on table pay $5000 for claim.
DRG Service
765 C-SECTION
766 C-SECTION
783 C-SECTION
784 C-SECTION
786 C-SECTION
787 C-SECTION
785 C-SECTION
788 C-SECTION
If claim has RevCode 636, then add 50% of charges to claim reimbursement.
OUTPUT:
PatNo Reimburs.
5 100
6 5150
So...
Patient 5's reimbursement is...(75% of 100) + (50% of 50) = 100
Patient 6's reimbursement is...(5000) + (50% of 300)
Assuming you've told us all the rules...
You can left join the tables, to check if values are present there or not, then use case expressions to apply the logic, and finally aggregate it to sum it all up...
SELECT
  yourTable.patno,
  SUM(
    CASE WHEN section.drg IS NOT NULL THEN 5000
         WHEN psych.drg IS NOT NULL THEN 0.75 * yourTable.billedcharges
         WHEN yourTable.revcode = 636 THEN 0.5 * yourTable.billedcharges
         ELSE 0
    END
  ) AS reimbursement
FROM
  yourTable
LEFT JOIN
  section
    ON section.drg = yourTable.drg
LEFT JOIN
  psych
    ON psych.drg = yourTable.drg
GROUP BY
  yourTable.patno
Please forgive typos, I'm on my phone.
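As a quick check of the join-then-CASE approach, the sketch below loads the sample rows and both lookup tables into an in-memory SQLite database through Python and runs the same query shape. The table names claims, psych and csection are placeholders chosen for this example. Note that a CASE expression applies only the first matching rule per row, so this relies on the DRG line and the RevCode 636 line being separate rows, as they are in the sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (patno INT, revcode INT, cpt TEXT, billedcharges REAL, drg INT);
INSERT INTO claims VALUES
  (5, 141, NULL,    100,  880),
  (5, 636, 'J1234', 50,   NULL),
  (6, 111, NULL,    8000, 783),
  (6, 636, 'J1234', 300,  NULL);

CREATE TABLE psych (drg INT);
INSERT INTO psych VALUES (876),(880),(881),(882),(883),(884),(885),(886),(887);

CREATE TABLE csection (drg INT);
INSERT INTO csection VALUES (765),(766),(783),(784),(786),(787),(785),(788);
""")

SQL = """
SELECT c.patno,
       SUM(CASE WHEN s.drg IS NOT NULL THEN 5000                    -- C-section: flat rate
                WHEN p.drg IS NOT NULL THEN 0.75 * c.billedcharges  -- psych: 75% of charges
                WHEN c.revcode = 636   THEN 0.5 * c.billedcharges   -- RevCode 636: 50%
                ELSE 0
           END) AS reimbursement
FROM claims c
LEFT JOIN csection s ON s.drg = c.drg
LEFT JOIN psych    p ON p.drg = c.drg
GROUP BY c.patno
ORDER BY c.patno
"""

reimbursement = dict(conn.execute(SQL).fetchall())
print(reimbursement)
```

The result matches the expected output in the question: 100 for patient 5 and 5150 for patient 6.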

Hidden Parameters for drop down selections using tlist/SQL

To give you some perspective on the question on hand, I created a report for a student information system which pulls student logs, based on the selectable criteria from a set of drop downs, on an html page. The drop down lists are populated using tlist/sql. The current report only has 3 drop downs:
Start Date, End Date, and Sports Logs. Here is the code for the Sports drop down:
SELECT DISTINCT log.logtypeid,
       CASE WHEN log.subtype is null THEN ' ' ELSE log.subtype END subID,
       lt.Name logtype,
       CASE WHEN to_char(st.ValueT) is null THEN ' - NONE' ELSE ' - ' || to_char(st.ValueT) END subtype
FROM log
INNER JOIN gen lt ON log.logtypeid = lt.id
LEFT OUTER JOIN gen st ON st.Name = to_char(lt.ID)
                      AND st.value = log.subtype
                      AND st.Cat = 'subtype'
WHERE lt.Cat = 'logtype'
  AND log.logtypeid = '3935'
ORDER BY subtype
Now in order for the report to pull as it was designed, I believe I need each selection from the Sports drop down to pull some data that will not be listed or displayed in the drop down. In its current state, and the way it should stay is...
Sports - Baseball
Sports - Bowling
Sports - Boys Basketball
Sports - Boys Golf
Sports - Dance
Sports - Diving
Sports represents logtypeid 3935. Baseball is subtype 101, bowling is subtype 102, etc.
For the report to pull the data as designed, there are two additional subtypes that need to be pulled, but unfortunately they hold a different logtypeid, 626. So if student id 1 has a 3935(logtypeid), 101(subtype), it should also pull his 626(logtypeid) 29(subtype) and 626(logtypeid) 43(subtype), if he/she has them.
Data Example:
STUDENTID LOGTYPEID SUBTYPE
6382 626 27
6382 626 41
6382 626 38
6382 626 43
6382 626 29
6382 3935 109
6382 3935 117
6383 626 43
6383 626 30
6383 626 43
6383 626 25
6383 626 43
6383 626 14
6383 3935 117
6400 626 38
6401 626 28
6401 626 36
6401 3935 110
6402 15 3
6405 3935 101
6405 3935 115
6405 626 29
6405 626 43
So, to simplify (I hope): all 3935s should be displayed in the Sports drop down, with their corresponding subtypes. For any studentid that matches the selected criteria (only one sport can be selected at a time), the report should also pull their logtypeid 626, subtype 29, AND logtypeid 626, subtype 43. I have tried everything I could possibly think of. I believe the answer may lie in a CASE WHEN expression, but I am unsure of the syntax. Any help or suggestions would be greatly appreciated. Thank you in advance.
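One way to read the requirement is as a report query rather than a change to the drop down: keep the drop down as it is, and have the report select the chosen sport rows plus the 626/29 and 626/43 rows for the same students. The sketch below shows that reading with Python's sqlite3 module and the sample data above. It is only an assumption about the intended behaviour, it uses integer ids instead of the string ids in the original query, and the function name rows_for_sport is hypothetical; the real report would need the equivalent Oracle SQL inside the tlist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (studentid INT, logtypeid INT, subtype INT)")
conn.executemany("INSERT INTO log VALUES (?, ?, ?)", [
    (6382, 626, 27), (6382, 626, 41), (6382, 626, 38), (6382, 626, 43),
    (6382, 626, 29), (6382, 3935, 109), (6382, 3935, 117),
    (6383, 626, 43), (6383, 626, 30), (6383, 626, 43), (6383, 626, 25),
    (6383, 626, 43), (6383, 626, 14), (6383, 3935, 117),
    (6400, 626, 38), (6401, 626, 28), (6401, 626, 36), (6401, 3935, 110),
    (6402, 15, 3),
    (6405, 3935, 101), (6405, 3935, 115), (6405, 626, 29), (6405, 626, 43),
])

def rows_for_sport(sport_subtype):
    """Rows for the chosen sport, plus 626/29 and 626/43 rows of the same students."""
    sql = """
    SELECT studentid, logtypeid, subtype
    FROM log
    WHERE studentid IN (SELECT studentid FROM log          -- students with the sport
                        WHERE logtypeid = 3935 AND subtype = :sport)
      AND ((logtypeid = 3935 AND subtype = :sport)          -- the sport row itself
           OR (logtypeid = 626 AND subtype IN (29, 43)))    -- the two extra subtypes
    ORDER BY studentid, logtypeid, subtype
    """
    return conn.execute(sql, {"sport": sport_subtype}).fetchall()

print(rows_for_sport(101))
```

With this shape the extra subtypes never appear in the drop down; they are only added to the rows the report returns.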

How to find number of references given another table in sql?

I have two tables:
person table
pid name phone
1 Mike 3123223232
2 Kyle
3 Nick 3124567654
4 John 3124566663
5 Pety 4234345453

accident table
acc_id pid type
132 1 car
133 3 snow
134 4 cold
135 2 sun
136 3 hot
137 2 sun
138 3 snow
139 2 cold
140 1 hot
I need to find all accidents (acc_id), each with a reference to the other accidents that happened to the same person, provided that the person has a valid telephone number.
So the result would be the following:
acc_id reference
133 136
133 138
136 133
136 138
138 133
138 136
132 140
140 132
So, person with pid = 3 had accidents 133, 136, 138 and this person has a phone, thus these three acc_id refer to each other. Next, pid = 2 also had three accidents, however since her phone number is unknown, we do not include her. Next, pid = 1 had two accidents 132, 140 and she has a phone number , so we include her accident numbers.
I know a method how to write a query to do this (for the sake of space I did not include), but it includes joining these tables two times and I think that there must be a more efficient way. Can anybody help me?
How about something like this? (not sure if this is what you already had)
select acc1.acc_id, acc2.acc_id as reference
from accident acc1
inner join accident acc2 on acc1.pid = acc2.pid and acc1.acc_id <> acc2.acc_id
inner join person on person.pid = acc1.pid
where person.phone is not null and person.phone <> ''
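To check the self-join against the sample data, here is a small sketch using Python's sqlite3 module. It assumes the tables are named person and accident as in the question, and that a missing phone number (Kyle) is stored as NULL; an empty string would be excluded by the same filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (pid INT, name TEXT, phone TEXT);
INSERT INTO person VALUES
  (1, 'Mike', '3123223232'),
  (2, 'Kyle', NULL),
  (3, 'Nick', '3124567654'),
  (4, 'John', '3124566663'),
  (5, 'Pety', '4234345453');

CREATE TABLE accident (acc_id INT, pid INT, type TEXT);
INSERT INTO accident VALUES
  (132, 1, 'car'), (133, 3, 'snow'), (134, 4, 'cold'),
  (135, 2, 'sun'), (136, 3, 'hot'),  (137, 2, 'sun'),
  (138, 3, 'snow'), (139, 2, 'cold'), (140, 1, 'hot');
""")

SQL = """
SELECT acc1.acc_id, acc2.acc_id AS reference
FROM accident acc1
INNER JOIN accident acc2
        ON acc1.pid = acc2.pid            -- same person
       AND acc1.acc_id <> acc2.acc_id     -- but a different accident
INNER JOIN person ON person.pid = acc1.pid
WHERE person.phone IS NOT NULL AND person.phone <> ''
ORDER BY acc1.acc_id, acc2.acc_id
"""

pairs = conn.execute(SQL).fetchall()
for acc_id, reference in pairs:
    print(acc_id, reference)
```

John (pid 4) has a phone but only one accident, so he contributes no pairs, and Kyle's three accidents are dropped by the phone filter.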