MDX request to show number of items sold by type? - mdx

I have one data warehouse table that contains one row for each item sold.
Each row contains the item's type.
What MDX request could show the number of items sold for each item type?
What (dimensions, levels, etc.) would I need to create?
In case it is relevant, I am using Pentaho/Mondrian/Spoon/Schema Workbench.

When you build a cube from the data warehouse you would typically aggregate rows for each product sale into totals for groups of one hour, or one day, per product. Few big cubes would support drilling down to individual product sales.
After creating a [Product] hierarchy/dimension you would create a virtual dimension based on that, using the item types, to give another way of breaking the information down.
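As a rough relational sketch of that idea (not MDX, and not Mondrian-specific), the snippet below pre-aggregates one-row-per-item sales into daily totals per product and then rolls those totals up by item type. The table and column names (`item_sold`, `daily_sales`, `item_type`) and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical warehouse table: one row per item sold.
    CREATE TABLE item_sold (
        sale_id   INTEGER PRIMARY KEY,
        sale_date TEXT,
        product   TEXT,
        item_type TEXT
    );
    INSERT INTO item_sold (sale_date, product, item_type) VALUES
        ('2024-01-01', 'bolt', 'hardware'),
        ('2024-01-01', 'bolt', 'hardware'),
        ('2024-01-01', 'saw',  'tool'),
        ('2024-01-02', 'nut',  'hardware'),
        ('2024-01-02', 'saw',  'tool');

    -- Aggregate individual sales into one row per day and product.
    CREATE TABLE daily_sales AS
    SELECT sale_date, product, item_type, COUNT(*) AS items_sold
    FROM item_sold
    GROUP BY sale_date, product, item_type;
""")

# Roll the daily totals up by item type, as a type-based dimension would.
by_type = dict(conn.execute(
    "SELECT item_type, SUM(items_sold) FROM daily_sales GROUP BY item_type"
))
daily_rows = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]
print(by_type)
```

The roll-up by type gives the same counts whether it runs against the raw rows or the daily aggregates, which is why the cube does not need to keep individual sales.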

Related

Designing a cube

I’ve been asked to create our analysis cube and have a design question.
We sell ‘widgets’ and ‘parts’ to go with those widgets. Each order has many widgets and sometimes a few parts.
What I’m stuck on is this: to me, an order is a fact in a measure. But what are the widgets? Are they a dimension, with each fact in the measure being an entry for every part and widget in the order?
So, if order 123 had widget 1 and widget 2 and part 5, then there will be 3 facts in the measure for the same order? Is that correct?
At its basic level you can consider most facts to be transactions or transaction line items. So, for example, you may have a 'sales' fact table in which each record represents one line item from that sale. Each fact record would have numeric columns representing metrics and other columns joining to dimension tables. The combination of those dimensions would describe that line item. So, in your case, you likely have something like:
1) A 'date' dimension detailing the date of the transaction
2) A 'widget' dimension detailing the widget sold on that transaction
3) A 'customer' dimension detailing the customer who bought that item (almost certainly the same customer would appear on every line item for this transaction)
4) ... determined by what information you have and what business problem you're trying to solve.
Now, the dimension tables contain further details. For example, your widget dimension table likely contains things like the name of the widget, the color, the manufacturer, etc. Every time your company sells one of these widgets, the record in the fact table links to that same dimension record for that name, color, manufacturer, etc. combination (i.e. you don't create a new dimension record every time you sell the same item - this is a one-to-many relationship - each dimension record may have many related fact records).
Your other dimension tables would similarly describe their dimensions. For example, the customer dimension might give the customer's name, their address, ...
So, the short answer to your question is that widget likely is a dimension, items and widgets may (or may not) actually be the same dimension (in a school class I suspect that they are), and that you would have 3 fact records for that one transaction.
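A minimal sketch of that layout for order 123, assuming widgets and parts share one item dimension. The names `dim_item`, `fact_sales`, and `item_kind` are hypothetical, and the date and customer dimensions are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One dimension row per distinct item, reused by every sale of that item.
    CREATE TABLE dim_item (
        item_key  INTEGER PRIMARY KEY,
        item_name TEXT,
        item_kind TEXT              -- 'widget' or 'part'
    );
    -- One fact row per order line item.
    CREATE TABLE fact_sales (
        order_number INTEGER,
        item_key     INTEGER REFERENCES dim_item(item_key),
        quantity     INTEGER
    );
    INSERT INTO dim_item VALUES
        (1, 'widget 1', 'widget'),
        (2, 'widget 2', 'widget'),
        (5, 'part 5',   'part');
    INSERT INTO fact_sales VALUES (123, 1, 1), (123, 2, 1), (123, 5, 1);
""")

# Order 123 appears as three fact rows, one per line item.
fact_rows = conn.execute(
    "SELECT COUNT(*) FROM fact_sales WHERE order_number = 123"
).fetchone()[0]

# The dimension attributes are what you slice the measure by.
by_kind = dict(conn.execute("""
    SELECT d.item_kind, SUM(f.quantity)
    FROM fact_sales AS f
    JOIN dim_item AS d ON d.item_key = f.item_key
    WHERE f.order_number = 123
    GROUP BY d.item_kind
"""))
print(fact_rows, by_kind)
```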
This is probably along the same lines as the prior answer but....
If you try to model "many widgets per order" you'll have issues, because you end up with a many (order fact) to many (widgets) relationship. In a cube / star schema design, many-to-many relationships usually need to be modelled out to be many-to-one in some way.
So what you do is try to identify what special thing identifies an "order" (as opposed to a bunch of widgets in an order). Usually that is simply stuff like order date, customer, order number, tax.
An example way to model this is:
If you have a single order with five widgets, you model that as a fact table with five records that happen to have a repeating customer, date, order number, etc. in them.
Then you have to work out how you spread an order header tax amount over five records. The two obvious solutions are:
Create a widget that represents tax and add that as another record
Spread the tax over five records, either evenly or weighted by something
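The second option can be sketched as below. This is only one way to do it: the helper name `allocate`, the use of integer cents, and the rule for handing out leftover cents (largest fractional share first) are my own assumptions, not something from the answer above.

```python
def allocate(total_cents: int, weights: list[int]) -> list[int]:
    """Split total_cents across lines in proportion to weights.

    The returned shares always sum exactly to total_cents.
    """
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")
    # Integer share for each line, rounded down.
    shares = [total_cents * w // weight_sum for w in weights]
    leftover = total_cents - sum(shares)
    # Give the leftover cents to the lines with the largest fractional parts.
    order = sorted(
        range(len(weights)),
        key=lambda i: (total_cents * weights[i]) % weight_sum,
        reverse=True,
    )
    for i in order[:leftover]:
        shares[i] += 1
    return shares


line_amounts = [1000, 1000, 2500, 500, 5000]  # five widget lines, in cents
order_tax = 799                               # header-level tax, in cents

even = allocate(order_tax, [1] * len(line_amounts))  # spread evenly
weighted = allocate(order_tax, line_amounts)         # weighted by line amount
print(even, weighted)
```

Either way the five fact records add back up to the order-level tax, so totals in the cube still match the source system.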
Modelling "parts" just takes these concepts further.
It is important to understand what the end user wants to see and why they want to see parts. What do they want to measure by parts, and how do you assign higher-level values (like tax) down to lower levels like parts?

Comparing 2nd largest item per group to the largest item in the group using SQL

I have a relational database for a Burger Building application that a restaurant uses. Two of the tables contained in the DB are Category and Item. These are used to display the categories and then the customer can select a category (E.G. Buns) and view all of the children contained in that category and choose which ones to add to their order. The two tables are linked using a field called CategoryID.
The Item table contains, among many others, the following fields: ItemID, ItemName, TimesOrdered, CategoryID.
One of the required functions is to view the item that has been ordered the most (most popular) per category. This can be retrieved from the TimesOrdered field. However, if two items have been ordered the same number of times, then there is technically no item in that category that has been ordered the most.
Therefore, the largest TimesOrdered field will have to be compared to the second largest TimesOrdered field to determine if any items have been ordered the most for that category.
Is there any way to achieve this using SQL? For example, showing the ItemID for each category (grouping on CategoryID) that has been ordered the most, as long as the second most ordered item has been ordered fewer times than the most ordered item.
I know that this can obviously be done by simply viewing the first two items and comparing the second record's TimesOrdered field with the first record's TimesOrdered field, but as a challenge and a way to improve my SQL, is there any way to get the desired results using a single SQL statement?
Thanks in advance for any responses :)
Would it be possible to share some sample data? For example, what types of records are in your Item table?
How specifically is your Item table related to your Category table? Do you have multiple items per category?
I'd also want to know how the TimesOrdered field gets updated. Is this something that is updated manually by a user whenever that item is ordered, or handled by code?
Regarding the output: It sounds like you want to display, by category, the item with the most orders. Is this correct? If so, would it be displayed via a query the user runs? It sounds like you want to display something different for categories with multiple items having the "max ItemCount" for that category. If a given category has multiple items with max ItemCount, what should display for that category? Could you provide some sample output of what you're expecting to see?
I'm thinking the best way to handle this would be to use multiple sub-queries, which can get rather hairy in Access. It might be best to break this into separate queries in Access, which you can progressively select from:
Create a query Q1 that shows the max TimesOrdered for each category.
Create a query Q2 that uses Q1 to figure out how many items for each category have the max TimesOrdered value.
Depending on how you want to display the final results, you could create a new query, Q3, that either shows NULL for the item in that category (if there's a tie), or the appropriate item. Basically, you'd display the item from each category where the TimesOrdered matches the max TimesOrdered for that category (having to possibly do special handling for categories with ties).
Another thing you might want to think about: What about having a separate Orders table that stores details of each order, rather than having a TimesOrdered field? Of course, that would complicate your queries further, but give you more data to report on.
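For what it's worth, the Q1/Q2/Q3 steps can also be folded into one statement. Here is a sketch run against SQLite from Python rather than Access, so the syntax may need adjusting for your database. The sample rows are made up, and showing NULL for a tied category is just one of the display choices mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (
        ItemID       INTEGER PRIMARY KEY,
        ItemName     TEXT,
        TimesOrdered INTEGER,
        CategoryID   INTEGER
    );
    -- Made-up sample data: category 2 has a tie for first place.
    INSERT INTO Item VALUES
        (1, 'Sesame bun',  10, 1),
        (2, 'Brioche bun',  7, 1),
        (3, 'Beef patty',  12, 2),
        (4, 'Chicken',     12, 2),
        (5, 'Veggie',       3, 2),
        (6, 'Ketchup',      4, 3);
""")

top_items = conn.execute("""
    SELECT q1.CategoryID,
           -- Q3: show the item only when exactly one item holds the maximum.
           CASE WHEN COUNT(*) = 1 THEN MIN(i.ItemID) END AS TopItemID
    FROM (
        -- Q1: the highest TimesOrdered per category.
        SELECT CategoryID, MAX(TimesOrdered) AS MaxOrdered
        FROM Item
        GROUP BY CategoryID
    ) AS q1
    -- Q2: the items in each category that reach that maximum.
    JOIN Item AS i
      ON i.CategoryID = q1.CategoryID
     AND i.TimesOrdered = q1.MaxOrdered
    GROUP BY q1.CategoryID
    ORDER BY q1.CategoryID
""").fetchall()
print(top_items)
```

If you would rather drop tied categories from the output entirely, adding `HAVING COUNT(*) = 1` after the `GROUP BY` should do it.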

SSAS Cartesian product between dimension when no measure selected

I'm testing a Multidimensional model by using Excel. Simplifying, I've two dimension tables, products and categories, and a sales fact table. There is a relationship between products and categories and between sales and products.
When I analyze this model in Excel and put products and categories as row labels, without adding the sales amount as a measure, a Cartesian product seems to occur between product values and category values, as if the relationship between them had no effect.
From the user's point of view this is undesirable behaviour. A user might want to navigate the structure of the model without initially selecting any measures. So if a category is linked to one or more products, selecting these two dimension tables should show the right data combinations and not a Cartesian product.
Now, how can I solve this issue, please? Thanks
You should consider uniting Products and Categories into one dimension. As you said, Categories are related to Products, and facts are linked to Products. By uniting them, you can view sales divided by a Category-Product hierarchy, and have a tree-like view in the Excel filter.
In your [Dim Product] dimension, add the Category table and draw a relation between the tables. Add fields from the Category table and create a Category - Product hierarchy. Here is a sample of similar design.
Two different dimensions should be used only if the objects are really independent, which is not your case.
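To illustrate the difference in plain relational terms (this is not SSAS itself), the sketch below contrasts pairing every product with every category against a single product dimension that already carries its category. The `product`, `category`, and `dim_product` names and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT
    );
    CREATE TABLE product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id  INTEGER REFERENCES category(category_id)
    );
    INSERT INTO category VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO product VALUES
        (10, 'Road bike',     1),
        (11, 'Mountain bike', 1),
        (20, 'Kids helmet',   2);

    -- One combined dimension: each product row carries its category.
    CREATE VIEW dim_product AS
    SELECT p.product_id, p.product_name, c.category_name
    FROM product AS p
    JOIN category AS c ON c.category_id = p.category_id;
""")

# Two unrelated dimensions behave like a cross join: every pairing shows up.
cross_pairs = conn.execute(
    "SELECT COUNT(*) FROM product CROSS JOIN category"
).fetchone()[0]

# The combined dimension only contains the real Category - Product pairs.
hierarchy = conn.execute("""
    SELECT category_name, product_name
    FROM dim_product
    ORDER BY category_name, product_name
""").fetchall()
print(cross_pairs, hierarchy)
```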

Oracle SQL (Multiple stock items against one product id)

Might be incredibly naive but it's troubling me nonetheless. As part of my coursework, I have to create a database system for imaginary customers purchasing an item that first needs to be made. One problem I'm having is allocating multiple stock items in the works orders table against one particular product ID. So for example, how would I get 4 legs and a table top, which would come under two individual IDs (legs and table top), against the product ID containing 'table'? I'm currently working with SQL*Plus.
Image of my ERM is here for clarification

Cube Combining Facts

My goal is to generate a report that has by day: order count, quantity sum, receive sum (we ship our product out and get it back). I have two cubes now: one for the orders and one for the receives. I've created a new cube that includes both dimensions but they are not in sync by date. It's as if there is not a join between the two fact tables by the date. Any ideas?
Thanks
In one cube, include both your orders and receives Measure Groups, and one instance of your Date dimension. In the cube's Dimension Usage tab, relate both Measure Groups to the Date dimension using their relevant date keys.
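The relational equivalent of that setup looks roughly like the sketch below: each fact table is summarised by its own date key and then lined up against one shared date table. The names `dim_date`, `fact_orders`, and `fact_receives` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_key TEXT PRIMARY KEY);
    CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, order_date TEXT, quantity INTEGER);
    CREATE TABLE fact_receives (receive_id INTEGER PRIMARY KEY, receive_date TEXT, quantity INTEGER);

    INSERT INTO dim_date VALUES ('2024-01-01'), ('2024-01-02'), ('2024-01-03');
    INSERT INTO fact_orders (order_date, quantity) VALUES
        ('2024-01-01', 5), ('2024-01-01', 3), ('2024-01-02', 4);
    INSERT INTO fact_receives (receive_date, quantity) VALUES
        ('2024-01-01', 2), ('2024-01-03', 6);
""")

report = conn.execute("""
    SELECT d.date_key,
           COALESCE(o.order_count, 0)  AS order_count,
           COALESCE(o.quantity_sum, 0) AS quantity_sum,
           COALESCE(r.receive_sum, 0)  AS receive_sum
    FROM dim_date AS d
    -- Summarise each fact on its own date key before joining, so rows
    -- from one fact table never multiply rows from the other.
    LEFT JOIN (
        SELECT order_date, COUNT(*) AS order_count, SUM(quantity) AS quantity_sum
        FROM fact_orders
        GROUP BY order_date
    ) AS o ON o.order_date = d.date_key
    LEFT JOIN (
        SELECT receive_date, SUM(quantity) AS receive_sum
        FROM fact_receives
        GROUP BY receive_date
    ) AS r ON r.receive_date = d.date_key
    ORDER BY d.date_key
""").fetchall()
print(report)
```

The two facts are never joined to each other directly. They only meet through the shared date, which is what relating both Measure Groups to a single Date dimension gives you in the cube.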