SSAS Calculated Member

SSAS Calculated Member - ssas

I have a cube with a measure called FactSales, which has entries for each day.
I have three Dimensions, Date, Customer and CustomerType.
Each FactSales row is linked to a date and customer by foreign key. customer is linked to customer type by foreign key.
From this I am able to spit out all sales figure for each customer on each date which is great.
I have multiple types : typeA, typeB, typeC, typeD, typeE.
What I want though is to create two calculated members which have the values aggregated for each customer by typeA, and then by everyother type.
What I have at the moment is something like
Case
When IsEmpty( [Measures].[FactSales] ) or [Customer].[CustomerType].currentmember <> [Customer].[CustomerType].&[typeA]
Then null
ELSE ([Customer].[CustomerType].&[typeA], [Measures].[FactSales] )
END
but I think this is wrong, and i also can't use this same method to get the value of all the other types excluding the typeA. I also dont get the aggregation when i roll it up to a higher level.
Can anyone help? I may not have explained my self well enough so please let me know if you need more info.

Why you are trying to do that? It's not good to store any calculation that is not related to single fact, data in fact table row should correspond only to one system fact (order, job, etc), and with aggregation you will be able to get data based on any dimension.
Your current structure is well and you can easily get calculations by slicing the cube over customer type dimension.

Related

I need help counting char occurencies in a row with sql (using firebird server)

I have a table where I have these fields:
id(primary key, auto increment)
car registration number
car model
garage id
and 31 fields for each day of the mont for each row.
In these fields I have char of 1 or 2 characters representing car status on that date. I need to make a query to get number of each possibility for that day, field of any day could have values: D, I, R, TA, RZ, BV and LR.
I need to count in each row, amount of each value in that row.
Like how many I , how many D and so on. And this for every row in table.
What best approach would be here? Also maybe there is better way then having field in database table for each day because it makes over 30 fields obviously.

There is a better way. You should structure the data so you have another table, with rows such as:
CarId
Date
Status
Then your query would simply be:
select status, count(*)
from CarStatuses
where date >= #month_start and date < month_end
group by status;
For your data model, this is much harder to deal with. You can do something like this:
select status, count(*)
from ((select status_01 as status
from t
) union all
(select status_02
from t
) union all
. . .
(select status_31
from t
)
) s
group by status;

You seem to have to start with most basic tutorials about relational databases and SQL design. Some classic works like "Martin Gruber - Understanding SQL" may help. Or others. ATM you miss the basics.
Few hints.
Documents that you print for user or receive from user do not represent your internal data structures. They are created/parsed for that very purpose machine-to-human interface. Inside your program should structure the data for easy of storing/processing.
You have to add a "dictionary table" for the statuses.
ID / abbreviation / human-readable description
You may have a "business rule" that from "R" status you can transition to either "D" status or to "BV" status, but not to any other. In other words you better draft the possible status transitions "directed graph". You would keep it in extra columns of that dictionary table or in one more specialized helper table. Dictionary of transitions for the dictionary of possible statuses.
Your paper blank combines in the same row both totals and per-day detailisation. That is easy for human to look upon, but for computer that in a sense violates single responsibility principle. Row should either be responsible for primary record or for derived total calculation. You better have two tables - one for primary day by day records and another for per-month total summing up.
Bonus point would be that when you would change values in the primary data table you may ask server to automatically recalculate the corresponding month totals. Read about SQL triggers.
Also your triggers may check if the new state properly transits from the previous day state, as described in the "business rules". They would also maybe have to check there is not gaps between day. If there is a record for "march 03" and there is inserted a new the record for "march 05" then a record for "march 04" should exists, or the server would prohibit adding such a row. Well, maybe not, that is dependent upon you business processes. The general idea is that server should reject storing any data that is not valid and server can know it.
you per-date and per-month tables should have proper UNIQUE CONSTRAINTs prohibiting entering duplicate rows. It also means the former should have DATE-type column and the latter should either have month and year INTEGER-type columns or have a DATE-type column with the day part in it always being "1" - you would want a CHECK CONSTRAINT for it.
If your company has some registry of cars (and probably it does, it is not looking like those car were driven in by random one-time customers driving by) you have to introduce a dictionary table of cars. Integer ID (PK), registration plate, engine factory number, vagon factory number, colour and whatever else.
The per-month totals table would not have many columns per every status. It would instead have a special row for every status! The structure would probably be like that: Month / Year / ID of car in the registry / ID of status in the dictionary / count. All columns would be integer type (some may be SmallInt or BigInt, but that is minor nuancing). All the columns together (without count column) should constitute a UNIQUE CONSTRAINT or even better a "compound" Primary Key. Adding a special dedicated PK column here in the totaling table seems redundant to me.
Consequently, your per-day and per-month tables would not have literal (textual and immediate) data for status and car id. Instead they would have integer IDs referencing proper records in the corresponding cars dictionary and status dictionary tables. That you would code as FOREIGN KEY.
Remember the rule of thumb: it is easy to add/delete a row to any table but quite hard to add/delete a column.
With design like yours, column-oriented, what would happen if next year the boss would introduce some more statuses? you would have to redesign the table, the program in many points and so on.
With the rows-oriented design you would just have to add one row in the statuses dictionary and maybe few rows to transition rules dictionary, and the rest works without any change.
That way you would not

Calculated member for dimensions

Firstly gonna show you example. We've got a fact table with some id, which is not primary key. Also we have dimension with all ids from fact table and names for that. Our id from fact table is a measure with aggregation function max. Is it possible to create calculated member, which will show name from our dimension using id from fact table? I know that it could be solved using rn and that structure:
Dimension.Hierahchy.Level.Item (meadures.rn).name
But is it possible to solve this another way?
We need to get key for number from measure. Something like that
Dimension.Hierahchy.Level.&[value of measures.maxid]

In mdx you can easily extract a maximum key of a set of members.
MAX(
Dimension.Hierahchy.Level.MEMBERS,
Dimension.Hierahchy.CurrentMember.MEMBERKEY
)
(the above is total guesswork as your current question does not include any example of mdx that you have already tried)

Multiple Joins from one Dimension Table to single Fact table

I have a fact table that has 4 date columns CreatedDate, LoginDate, ActiveDate and EngagedDate. I have a dimension table called DimDate whose primary key can be used as foreign key for all the 4 date columns in fact table. So the model looks like this.
But the problem is, when I want to do sub-filtering for the measures based on the date column. For ex: Count all users who were created in the last month and are engaged in this month. This is not possible to do with this design, coz when I filter the measure with create date , I can’t further filter for a different time window for engaged date. Since all the connected to same dimension, they are not working independently.
However, If I create a separate date dimension table for each of the columns, and join them like this then it works.
But this looks very cumbersome when I have 20 different date columns in fact table in real world scenario, where I have to create 20 different dimensions and connect them one by one. Is there any other way I can achieve my scenario w/o creating multiple duplicated date dimensions?

This concept is called a role-playing dimension. You don't have to add the table to the DSV or the actual dimensions one time for each date. Instead add the date once, then go to the dimension usage tab. Click Add Cube Dimension, and then choose the date dim. Right-click and rename it. Then update the relationship to use the correct fields.
There's a good article on MSSQLTips.com that covers this topic.

Customer Dimension as Fact Table in Star Schema

Can Dimension Table became a fact table as well? For instance, I have a Customer dimension table with standard attributes such as name, gender, etc.
I need to know how many customers were created today, last month, last year etc. using SSAS.
I could create faceless fact table with customer key and date key or I could use the same customer dimension table because it has both keys already.
Is it normal to use Customer Dimension table as both Fact & Dimension?
Thanks

Yes, you can use a dimension table as fact table as well. In your case, you would just have a single measure which would be the count - assuming there is one record per customer in this customer table. In case you would have more than one record per customer, e. g. as you use a complex slowly changing dimension logic, you would use a distinct count.

Given your example, it is sufficient to run the query directly against the Customer dimension. There is no need to create another table to do that, such as a fact table. In fact it would be a bad idea to do that because you would have to maintain it every day. It is simpler just to run the query on the fly as long as you have time attributes in the customer table itself. In a sense you are using a dimension as a fact but, after all, data is data and can be queried as need be.

Calculated Measure using dimension

I am trying to build a calculated measure in SSAS that incorporates a dimension parameter. I have two facts: Members & Orders and one Dimension: Date. Members represents all the unique members on my site. Orders are related to members by a fact key representing a unique user. Orders also contains a key representing the vendor for an order. Orders contains a key to the date dimension.
FactMember
- MemberFactKey
- MemberId
FactOrder
- FactOrderKey
- OrderId
- FactMemberKey
- DimVendorKey
- DimDateKey
DimDate
- DimDateKey
- FYYear
The calculated measure I am trying to build is the number of unique vendors a member has ordered from. The value of the calculation must of course change based on the date dimension.

Wouldn't the DISTINCTCOUNT function be the one to use here? Creating a distinct count of Vendors could then be used in this query and elsewhere.
WITH MEMBER [Test]
AS
DISTINCTCOUNT([Vendor].[Vendor].[Vendor])
I will say in advance that this may well be slow (Depending on data volume/distribution), so if this query will be a popular/big part of the design it may be worth considering a restructure.

I am confused, it would make more sense to make Members and Orders both separate dimensions and then reference them from a FACT table, say Fact.Sales. This would eliminate the need to even build a calculated member if you keyed your Members dimension on some sort of member_key.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas