Clickstream measures at different granularities - ssas

This is possibly a simple problem which I have yet to overcome.
Consider a cube based on clickstream data.
First, I have a fact table based on page views, with one row per page view on a site. Measures include [Views], [Visits], [Bounce Rate], etc.
Second, I have a measure group based on leads. Measures include [Leads], [Revenue], [Margin], etc.
One page view can create multiple leads, so a one-to-many relationship exists. Off this Leads fact I also have a Leads dimension which describes the lead. An example attribute is [Quality] = Good / Bad.
Now, when browsing the cube, I might want to see the number of [Views] or [Visits] against the Lead attribute [Quality]. The problem with one [View] to many [Leads] is that the [Views] are incorrectly multiplied, once for each lead created.
e.g. One [View] created 3 [Leads]. [views] by [quality] = 3
I want Views by Quality to be DIVIDED and = 0.33 or, ideally, still 1.
"1 View Created 2 bad and 1 Good Lead"
Would anyone have any ideas on how to solve this ?
THANK YOU !!!
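The fan-out described in the question can be reproduced outside the cube. The following is only a minimal Python sketch of the arithmetic, not an SSAS solution; the row layout and the names `leads`, `view_id`, and `views_by_quality` are made up for illustration. It shows the naive count of 3 and the desired count of 1 per quality value (and 1 overall) when views are counted distinctly.

```python
from collections import defaultdict

# Hypothetical rows: one page view (view_id 1) that produced three leads.
leads = [
    {"lead_id": 10, "view_id": 1, "quality": "Bad"},
    {"lead_id": 11, "view_id": 1, "quality": "Bad"},
    {"lead_id": 12, "view_id": 1, "quality": "Good"},
]

def views_by_quality(rows):
    """Count distinct views per quality value, plus a distinct grand total."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["quality"]].add(row["view_id"])
    per_quality = {quality: len(ids) for quality, ids in seen.items()}
    total = len({row["view_id"] for row in rows})
    return per_quality, total

# Joining views to leads repeats the single view once per lead.
naive_views = len(leads)
per_quality, total_views = views_by_quality(leads)
print(naive_views, per_quality, total_views)
```

Note that the per-quality counts are deliberately not additive here: the same view appears under both Good and Bad, while the total stays at 1.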

Related

How to create metric in MAQL

I have a measure called "TimeSpent", which is the time in seconds that the user spends on a certain page. I have an attribute called "Domain", which is the site the user visited, and I have the "Date". I want to create a stacked bar chart showing the sum of TimeSpent for each domain on each day. As there are too many domains, I tried to filter the top 10. However, that returns the top 10 overall, and I would like to show the top 10 sites for each day. I tried many different things, but I am struggling. Could someone help me?
This really depends on your model and connection points, and also how that initial metric "TimeSpent" was built. At first thought you could try something like the following:
Select TimeSpent Where top (10) in (Select TimeSpent) By (connectionpoint) within (domain, ALL OTHER)
However, you may need to incorporate a Disconnected Date Dimension into the LDM in order to make this work in your use case.
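To make the difference between "top 10 overall" and "top 10 per day" concrete, here is a small Python sketch of the per-day ranking the question asks for. It is not MAQL and says nothing about how GoodData evaluates the metric; the function name `top_n_per_day` and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict

def top_n_per_day(rows, n):
    """Return {date: [(domain, total_seconds), ...]} keeping the n largest domains per date."""
    totals = defaultdict(lambda: defaultdict(int))
    for date, domain, seconds in rows:
        totals[date][domain] += seconds
    result = {}
    for date, by_domain in totals.items():
        # Sort by total descending, then by domain name so ties are deterministic.
        ranked = sorted(by_domain.items(), key=lambda kv: (-kv[1], kv[0]))
        result[date] = ranked[:n]
    return result

# Hypothetical sample rows: (date, domain, TimeSpent in seconds).
rows = [
    ("2015-01-01", "a.com", 120),
    ("2015-01-01", "b.com", 300),
    ("2015-01-01", "a.com", 60),
    ("2015-01-01", "c.com", 30),
    ("2015-01-02", "c.com", 500),
    ("2015-01-02", "a.com", 10),
    ("2015-01-02", "b.com", 20),
]
top2 = top_n_per_day(rows, 2)
```

With these rows, c.com is dropped on the first day but ranks first on the second, which is exactly what an overall top-N filter cannot express.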

MDX IIF statement to calculate new member basing on measure and hierarchy leaves

I have a simple data cube for computing shared expenditures, with one measure, Amount, and some dimensions and hierarchies, of which the interesting one is Relationship. It describes who bought something for whom. Its structure is:
Who
For whom
Relationship key
I am trying to code a calculation representing debt. For example, if I bought something for shared use, it would be 0.5 * Amount. On the other hand, if I bought something for myself, it would be 0 * Amount.
So far I have tried the following calculation:
IIF(
[dimRelationship].[Relationship].currentMember = [dimRelationship].[Relationship].[RelatonshipID].&[MeShared],
[Measures].[Amount]*0.5,
[Measures].[Amount]*0)
It works correctly only at the lowest RelationshipID level. When I roll up in the pivot-table browser, it follows the else-expression. That is not really surprising, because the hierarchy's CurrentMember is no longer MeShared. Worse, the total aggregation doesn't work either, and that is the most important part as a general summary. Is there a suffix like .LeafMember, or something similar, that could help me perform this calculation?
Thank you in advance!
Best regards,
Max!
In your fact table, add another column, and in this column store the "Amount" multiplied by the factor that applies to that row's relationship. This will address your issue right out of the box.
Try a SCOPE statement.
First, create a new calculation [Debt Calculation] as [Measures].[Amount]*0 /* the ELSE scenario */
Then create a SCOPE:
SCOPE([Debt Calculation],[dimRelationship].[Relationship].[RelatonshipID].&[MeShared]);
THIS=[Measures].[Amount]*0.5;
END SCOPE;
If it's the lowest level of the hierarchy and is the dimension key (the one used to link to measure groups), it will recalculate the higher levels of this dimension automatically. Please post the result here if it doesn't.
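Both answers come down to applying the factor per leaf row and only then aggregating. The following Python sketch illustrates that arithmetic only; it does not model how SSAS evaluates SCOPE. The names `FACTOR`, `debt_leaf_first`, and `debt_at_total`, the `MeOnly` key, and the sample amounts are hypothetical; only `MeShared` and the 0.5 / 0 factors come from the question.

```python
# Factor per relationship key; any key not listed falls back to 0, like the ELSE branch.
FACTOR = {"MeShared": 0.5}

# Hypothetical fact rows: (relationship key, amount).
facts = [
    ("MeShared", 100.0),
    ("MeOnly", 40.0),
    ("MeShared", 20.0),
]

def debt_leaf_first(rows):
    """Apply the factor to each leaf row, then sum the results."""
    return sum(amount * FACTOR.get(key, 0.0) for key, amount in rows)

def debt_at_total(rows, current_member="All"):
    """Sum first, then apply the factor of the rolled-up member (the IIF behaviour)."""
    total = sum(amount for _, amount in rows)
    return total * FACTOR.get(current_member, 0.0)

leaf_first = debt_leaf_first(facts)
at_total = debt_at_total(facts)
```

Evaluating the condition once at the rolled-up member yields 0 because that member is not MeShared, whereas the leaf-first version carries the 0.5 factor up into the total.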

Discretization Based on a Calculated Measure in Tabular Mode

I am currently trying to implement the following scenario on Tabular Mode SSAS, appreciate your support.
We have a fact table of Transactions that is linked to the Customer dimension, and a measure called Frequency that shows the number of times the user used his card during the selected period (the fact table is also linked to the Date dimension). What we need is a dimension that holds the frequency groups (for example, 1 to 5, 5 to 10, 10 to 15, and 15 & Above). The problem is that I am unable to link the fact table to this dimension, because the link between them would be a calculated measure.
Any thoughts?
Thanks and Best Regards
Omar Sultan
If you want to link the fact to a bucket dimension, you are going to have to specify the time granularity. I would suggest that you decide on one or more useful periods (day, week, month) and create a fact table (or several) to bucket your data at the appropriate grain.
This solution will lose flexibility from your original request, as the user will not be able to dynamically select the time period for the bucket, however they will gain from being able to compare fixed time periods to identify trends over time.
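As a rough illustration of bucketing at a fixed grain, the Python sketch below counts transactions per customer per month and assigns each count to a frequency group. It is not DAX or a Tabular model. The bucket labels follow the question, but treating them as half-open ranges is an assumption (the stated groups overlap at 5, 10, and 15), and the names `BUCKETS`, `bucket_for`, and `monthly_buckets` are made up.

```python
from collections import Counter

# Assumed half-open ranges: [1, 5), [5, 10), [10, 15), [15, infinity).
BUCKETS = [
    (1, 5, "1 to 5"),
    (5, 10, "5 to 10"),
    (10, 15, "10 to 15"),
    (15, None, "15 & Above"),
]

def bucket_for(frequency):
    """Return the bucket label for a frequency, or None when it is below 1."""
    for low, high, label in BUCKETS:
        if frequency >= low and (high is None or frequency < high):
            return label
    return None

def monthly_buckets(transactions):
    """transactions: iterable of (customer_id, 'YYYY-MM-DD').
    Returns {(customer_id, 'YYYY-MM'): bucket label}."""
    counts = Counter((customer, date[:7]) for customer, date in transactions)
    return {key: bucket_for(n) for key, n in counts.items()}

# Hypothetical transactions.
transactions = (
    [("C1", "2014-03-%02d" % day) for day in range(1, 7)]     # 6 uses in March
    + [("C1", "2014-04-01")]                                   # 1 use in April
    + [("C2", "2014-03-%02d" % day) for day in range(1, 17)]  # 16 uses in March
)
buckets = monthly_buckets(transactions)
```

The resulting (customer, month) rows could be loaded as a small fact table keyed to the bucket dimension, which is the trade-off the answer describes: the grain is fixed, but the same customer can be seen moving between buckets from month to month.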

How can I summarize and reuse a complex dataset

How can I re-use a single complex dataset across a number of tables?
The dataset has a number of computed columns that needs to be reported both in detail and in summary. Here's a very simplified example dataset:
is_food  sale_association  food_type  total_sold  total_associations  percent_total
1        Before Movie      Popcorn    50          3                   x BirtMath.safeDivide(...)
0        Before Movie      Soda       10          2                   x BirtMath.safeDivide(...)
1        During Movie      Jujubee    10          1                   x BirtMath.safeDivide(...)
0        After Movie       Soda       15          2                   x BirtMath.safeDivide(...)
From this one dataset, I'd want to create a detailed summary of all food types while rolling up non food (using the 'is_food' column), another summary of all food types, another detailed summary of food with rolled up non-food by sale_association, etc. etc.
The report would also contain a number of percentages (6 in the most complex table) that need to be calculated (some across a row, others across all rows in a given group), all of which can have a zero value for the denominator and so need to be guarded against with safeDivide (which is a PITA to do in the source SQL query which itself is doing aggregation -- checking for divide by zero when both the numerator and denominator are sums leads to hairy queries).
Obviously I can do this by tailoring the SQL query as appropriate, but it seems like a waste of time and effort to create 12 or 15 queries that are very similar when I've already managed to create the monster query for the most detailed table.
What doesn't seem straightforward is how to perform the rollups in a table. I managed to hack something together by hiding rows that would later be summed up (e.g. "is_food == 0" in the example) and then creating custom data bindings that are displayed in a footer row. Not only does it feel like a hack, it also interferes with the ability to naturally order rows. Again, going back to the example, if I was ordering by total_sold and summarizing rows with is_food == 0, the natural order should be Popcorn, Non-food, Jujubee.
There's nothing in the BIRT wiki about this, nor does "BIRT: A Field Guide, 3rd E." really delve into the topic.
This seems like a fairly open-ended question (although I agree that re-using a single dataset makes much more sense than having multiple queries retrieving the same data in slightly different ways). A few general suggestions:
Use the most detailed version of the data required as a common dataset for each BIRT report item (typically BIRT tables)
Where summary-only level reporting is required, add groups to the BIRT table at the desired level, add data items as required to the group headers/footers and delete the detail level row(s) from the BIRT table.
Where detail-level reporting is required in some cases (eg. for food items but not for non-food items), add groups to the BIRT table as above, and set the visibility of the detail row (in Property Editor - Properties - Visibility) to check Hide Element, then specify the appropriate expression to suppress the non-required rows (non-food items, in this example).
Aggregations (ie. summary expressions) can be added to tables by selecting the whole table, selecting the Binding tab within the Property Editor and clicking the Add Aggregation... button.
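The rollup the question describes (detail rows for food, a single "Non-food" row, ordered by total_sold) can be sketched in Python to check the expected result. This is not BIRT script; `safe_divide` is a stand-in written for this example rather than the real BirtMath.safeDivide, and `summarize` is a hypothetical name. The rows come from the example dataset in the question.

```python
def safe_divide(numerator, denominator, default=0.0):
    """Stand-in for a guarded division: return default when the denominator is zero."""
    return numerator / denominator if denominator else default

# (is_food, sale_association, food_type, total_sold) from the example dataset.
rows = [
    (1, "Before Movie", "Popcorn", 50),
    (0, "Before Movie", "Soda", 10),
    (1, "During Movie", "Jujubee", 10),
    (0, "After Movie", "Soda", 15),
]

def summarize(rows):
    """Keep food types as their own rows, roll non-food into one row,
    and order by total sold descending. Returns (label, total_sold, share_of_total)."""
    totals = {}
    for is_food, _association, food_type, sold in rows:
        label = food_type if is_food else "Non-food"
        totals[label] = totals.get(label, 0) + sold
    grand_total = sum(totals.values())
    ordered = sorted(totals.items(), key=lambda kv: -kv[1])
    return [(label, sold, safe_divide(sold, grand_total)) for label, sold in ordered]

summary = summarize(rows)
```

Because the rollup happens before sorting, the rolled-up row takes its natural position (Popcorn, Non-food, Jujubee) instead of being pinned to a footer.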

Implicit Fact division and Dimension Usage

I have a star schema with Implicit Fact division as shown in Figure 5 at http://www.information-management.com/infodirect/20020308/4858-1.html?pg=2.
My question is how do I set up the Dimension Usage? My first thought was to set up 3 Referenced Relationships (CustomerGroup to InvoiceItemFacts, GroupToCustomer to CustomerGroup, CustomerDimension to GroupToCustomer), but when I try this I get the message "A loop was found in the data source view at the 'dbo_CustomerGroup' table".
Update:
I have found that if I create a Regular Relationship between GroupToCustomer and InvoiceItemFacts (effectively bypassing the CustomerGroup table, because I already have the Customer Group Key), I can get some results. However, when I browse the cube and display the InvoiceItemFacts by Customer, the InvoiceItemFacts only display on the first Customer in the group.
GroupToCustomer looks to be a "fact-less fact table", so you would create a measure group on it (it doesn't need to be visible to end users), then set up a many-to-many relationship via that measure group in the Dimension Usage tab.
It's a little complicated by the extra table in the way, but that should be the approach.
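What a many-to-many relationship through the bridge table should produce can be sketched in Python. This only illustrates the expected numbers, not the SSAS configuration; the sample groups, customers, amounts, and the name `amount_by_customer` are made up.

```python
from collections import defaultdict

# Hypothetical fact rows: each invoice item carries only the customer group key.
invoice_facts = [("G1", 100.0), ("G1", 50.0), ("G2", 30.0)]

# Hypothetical bridge (factless) rows: which customers belong to which group.
group_to_customer = [("G1", "Alice"), ("G1", "Bob"), ("G2", "Carol")]

def amount_by_customer(facts, bridge):
    """Show each group's amount against every customer in that group."""
    by_group = defaultdict(float)
    for group, amount in facts:
        by_group[group] += amount
    by_customer = defaultdict(float)
    for group, customer in bridge:
        by_customer[customer] += by_group[group]
    return dict(by_customer)

per_customer = amount_by_customer(invoice_facts, group_to_customer)
# The grand total comes from the facts themselves, so shared amounts are not double counted.
grand_total = sum(amount for _, amount in invoice_facts)
```

Every customer in a group sees the group's full amount (rather than only the first customer, as with the regular relationship), while the grand total stays at the sum of the fact rows.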