In a transition from Mondrian OLAP server to Microsoft's SSAS OLAP server, I'm trying to find out where the aggregation tables fit in. The database have various fact tables, each with accompanying aggregation tables for year, month and day.
How can one design such a structure in Visual Studio? The current project is "Multidimensional and Data Mining". In cube designer, cube structure only handles measures and dimensions. Aggregations seem to not take any additional tables as parameters. I've found no help online for this issue either, most of the information is about aggregation design. The design is already in place, but how to connect it to the cube?
In Pentaho Mondrian, this was simply done in the schema xml adding aggregation tables inside cube section.
Can anyone share some thoughts on this topic?
In an SSAS Multidimensional model there's no such thing as an aggregation table (unlike in Tabular, which I believe does support them).
In SSAS MD the only storage structure is the cube. The equivalent of aggregation tables is aggregation designs, which store pre-aggregated values, as part of the cube, but not in a way that is visible and separate, as an aggregation table is.
Since your legacy system does use aggregation tables, maybe the migration would be easier if it went to MS SSAS Tabular rather than SSAS Multidimensional?
Related
Microsoft introduced MDX for analysis services and since then few things have changed in the market place. Microsoft now have column store analysis services tabular and power pivot that run on DAX. Also database vendors have moved to in-memory (SAP Hana). I have long given up on MDX as unnecessary in the current DAX tabular environment, however SAP HANA excel pug-in now uses MDX to query HANA models and I'm trying to access if its worth learning MDX again.
Thanks
Using MDX is one of several options to query SAP HANA information models.
Standard SQL queries would do just as well.
MDX is mainly aimed at providing a common interface language to access data sources and return the data into multi-dimensional structures.
It also provides several language concepts not covered by SQL, e.g. hierarchy processing.
I've yet to see a user that would write his MDX statements for ad-hoc reporting by hand...
I work for a company that has a very mature and precise olap environment - MDX is 100% relevent.
We will start to look to move certain functionality into the Tabular/DAX world but I wouldn't imagine stopping MDX for a good while.
To me it is a very pretty declarative language - elegant and powerful - much more so than sql or what I've so far seen of DAX.
If sql is checkers(draughts) then mdx is chess!
I'm new to MDX. I understand that MDX is a query language, not a data transformation language. However, I'm also aware that this distinction is partially meaningless; there is no clear line between transformation and reporting, and every query language is capable of some transformation. Proficiency in a query language requires knowing what transformations are reasonable, and which require a redesign of the underlying schema.
From what I've seen of MDX, it clearly has features designed for creating calculated members within a dimension. Beyond that, however, I'm not clear on its transformation capabilities. Can anyone provide a concise summary of which types of transformations MDX can reasonably be expected to do?
I don't intend for this question to be limited to my particular reporting challenge. However, by describing my project, I can illustrate a few of the transformation types I'm interested in. So, here's a description of what I'm working on:
I need to use MDX to create some inventory and sales reports. I'm working with Microsoft SQL Server 2008 Analysis Services. The data is organized into three different cubes: On-Hand Inventory, In-Transit Inventory, and Sales. My reports require that the data be transformed in several ways. For instance:
1) I need to infer a "Months" attribute from the "Weeks" attribute, using the rules of a 4-4-5 calendar. I'm fairly certain this can be done elegantly with MDX.
2) I need to infer a "Calendar Month" dimension from the "Months" attribute. I believe this can be done with MDX, but I'm not sure whether there is an elegant solution or a kludge which should be avoided in favor of a schema redesign.
3) I need to infer a "Region" dimension from the "Warehouse" dimension. I've seen no evidence that this can be done in an elegant way by MDX.
4) I need to calculate total inventory as On-Hand Inventory plus In-Transit Inventory. From searching the web, it seems that querying two different cubes is possible, but is discouraged in favor of schema redesign, but the water is still very muddy.
I would say most of your requirements can be done with Analysis Services, but not necessarily with MDX. Rather, they would be done in cube design. This is normally done using the GUI, which is Visual Studio called BIDS (Business Intelligence Development Studio). If you absolutely want to use a language, you could use XMLA, which is how BIDS communicates with the Analysis Services server. But this would still not be MDX, and is not very well documented, and hence difficult to learn. You could use .net and AMO, but the easiest way is the GUI in BIDS.
And some of your requirements would optimally be implemented in the design of the relational tables on which the cubes are based. The first three of your requirements are best implemented in the dimension tables, and then just used in the dimension objects in the cube definition. For the fourth requirement, you are right, this can easily be implemented in a calculated measure in the cube calculation script. And this, indeed, is MDX.
In theory, you could also implement the first three requirements somehow in MDX. But this would be complex, difficult to maintain and have bad performance. MDX is just not designed for tis type of requirement.
I have been trying to learn hana these past few days and have been getting some problems. So As i see SAP HANA is used for de-normalization of data(as per some tutorials that i have seen). So i make the analytic views and I have my data denormalized after making the analytical views. What next?. How do I harness/use these views to create reports for business analysis. I need to generate several reports based on this de-normalized data(which i intend to ultimately use for a website based product). Do i need to create different Anaytical views for different reports?
HANA is not for denormalization of your data. You don't have to create aggregate and denormalized tables to speed up your analytics. In a normal analytics scenario you might build these but this will result in duplicate data, double maintenance to keep these up date etc..
Instead of this you can use your normal normalized database tables as a master/transactional data foundation and then build analytic views on these. How many views have to be created for different reports depends on your actual business needs, because views contain data in many aspects so they can be reused. In case of more complex reports you can of course create calculation views to get the exact data you need.
HTH
I recently inherited a warehouse which uses views to summarise data, my question is this:
Are views good practise, or the best approach?
I was intending to use cubes to aggregate multi dimensional queries.
Sorry if this is asking a basic question, I'm not experienced with warehouse and analyis services
Thanks
Analysis Services and Views have the fundamental difference that they will be used by different reporting or analytic tools.
If you have SQL-based reports (e.g. through Reporting Services or Crystal Reports) the views may be useful for these. Views can also be materialised (these are called indexed views on SQL Server). In this case they are persisted to the disk, and can be used to reduce I/O needed to do a query against the view. A query against a non-materialized view will still hit the underlying tables.
Often, views are used for security or simplicity purposes (i.e. to encapsulate business logic or computations in something that is simple to query). For security, they can restrict access to sensitive data by filtering (restricting the rows available) or masking off sensitive fields from the underlying table.
Analysis Services uses different query and reporting tools, and does pre-compute and store aggregate data. The interface to the server is different to SQL Server, so reporting or query tools for a cube (e.g. ProClarity) are different to the tools for reporting off a database (although some systems do have the ability to query from either).
Cubes are a much better approach to summarize data and perform multidimensional analysis on it.
The problem with views is twofold: bad performance (all those joins and group bys), and inability to dice and slice the data by the user.
In my projects I use "dumb" views as a another layer between the datawarehouse and the cubes (ie, my dimensions and measure groups are based on views), because It allows me a greater degree of flexibility.
Views are useful for security purposes such as to restrict/control/standardise access to data.
They can also be used to implement custom table partitioning implementations and federated database deployments.
If the function of the views in your database is to facilitate the calculation of metrics or statistics then you will certainly benefit from a more appropriate implementation, such as that available through a data warehouse solution.
I was in the same boat a few years ago. In my case I had access to another SQL server. On the second server I created a link server to the warehouse and then created my views and materialized views on the second server. In a sense I had a data warehouse and a reporting warehouse. For the project this approach worked out best as we were required to give access to the data to other departments and some vendors. Splitting the servers into two separate instances, one for warehousing and one for reporting also alleviated some of the risks involved in regards to secure access.
I would like to know more about "MDX" (Multidimensional Expressions).
What is it?
What is it used for?
Why would you use it?
Is it better than SQL?
What is its use in SAP BPS (I haven't seen BPC, just heard that MDX is in it and want to know more)?
MDX is the query language developed by Microsoft for use with their OLAP tools. Since its creation, others (The open source project Mondrian, and Hyperion) have tried to create versions of it for use in their products.
OLAP data tends to look like a star-schema with a central fact table surrounded by multiple dimensions. MDX is designed to allow you to query these structures and create cross-tab type results.
While the language looks like SQL it doesn't behave like it and if you are an SQL programmer, the mental leap can be tough.
As to whether it is better than SQL, it serves a highly specialized purpose, i.e. analyzing data in a specific format. So if you want to query a star schema, it is better, otherwise, SQL will probably do the job.
MDX means Multi Dimensional eXpressions or some such. It is relevant to OLAP cubes and not to regular relational databases such as Oracle or SQL Server (although some SQL Server editions come with Analysis Services which is OLAP). The multidimensional world is about data warehousing and efficient reporting, not about doing normal transactional processing so you wouldn't use it for an order entry system, but you might move that data into a datamart to run reports against to see sales trends. That should be enough to get you started I hope.
SQL is for 'traditional' databases (OLTP). Most people learn the basics fairly easily.
MDX is only for multi-dimensional databases (OLAP), and is harder to learn than SQL in my opinion. The trouble is they look very similar.
Many programmers never need MDX even if they have to query multi-dimensional databases, because most analysis software forces them to build reports with drag-drop interfaces.
If you don't have a requirement to work with a multi-dimensional database, then don't create one just for the fun of it.....it won't be...
There are 2 versions of SAP-BPC (Business Objects Planning and Consolidation)
SAP-BPC Netweaver
SAP-BPC Microsoft Analysis Services
The Microsoft analysis services version of the product allows you to use MDX or multi dimensional expressions to both query the multi-dimensional database (OLAP) and write calculation logic.
However, SAP-BPC does not require a knowledge of MDX to either be used or administered.
You can see product documentation and a demonstration.
Best of luck on your research,
Focused on SAP BPC:
What is it used for?
It's used when you want to apply some custom calculation/business logic over many records/intersections and after submitting raw data. Example, first send prices in one input schedule, then quantities in other one, as a third step run a calculation for sales amount based on prices and quantities for all products.
It's also used to execute the Business Rules, for that you run a predefined program (like CALC_ACCOUNT, CONSOLIDATION, etc)
Is it better than SQL?
In BPC, "SQL" logic scripts have better performance than MDX. However SQL for BPC purposes has not much to do with SQL used in other it's just how they call it.
You will get a good start by just searching for MDX in the search box up top.