I'm wondering if it's possible to limit the CPU/memory used by a specific SSAS cube. This would need to be applied to a specific user group, similar to the Resource Governor we have in SQL Server.
This is primarily to prevent users from raising expensive queries.
Best
Lee
What are the key differences between OLAP and OLTP databases, specifically in terms of implementation (rather than use cases)?
OLAP is of course primarily used for reporting while OLTP is used for handling transactions.
I understand that OLAP databases are optimized for read over write, and that OLAP databases contain more denormalised data.
What other characteristics set the two apart?
OLTP:
As the name suggests, "Online Transaction Processing" is used for transactional needs: INSERT/SELECT/UPDATE/DELETE.
Low response time.
These are the original source of data.
Data is usually stored in 3NF form.
ACID properties are strictly followed.
OLAP:
As the name suggests, "Online Analytical Processing" is used for complex analytical queries and drawing inferences.
Periodic batch processing jobs are run here.
Typically de-normalized with fewer tables; use of star and/or snowflake schemas.
ACID properties are NOT necessarily followed.
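To make the contrast concrete, here is a minimal SQL sketch (all table and column names are invented for illustration): a typical OLTP statement touches one row inside a short transaction, while a typical OLAP query scans and aggregates a large, denormalized fact table.

    -- OLTP: short, single-row transaction against a normalized (3NF) schema.
    BEGIN TRANSACTION;
    INSERT INTO orders (order_id, customer_id, order_date)
    VALUES (1001, 42, '2014-05-01');
    COMMIT;

    -- OLAP: analytical query that scans and aggregates many fact rows.
    SELECT order_month, SUM(order_amount) AS total_amount
    FROM fact_orders
    GROUP BY order_month;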
There are many differences. You can find tons of answers by googling this question, but here are some characteristics derived from practical implementation in my own experience:
OLTP is a business-domain-specific system designed to perform specific tasks. For example, an eCommerce website may have one database for handling online orders, another OLTP database for back-end order processing, another OLTP database for logistics, and so on. OLAP systems, in contrast, are designed to look at information at the whole-business level by sourcing data from many heterogeneous systems.
To simplify the above example: OLTP systems are small units of business processing, while an OLAP system is a large unit of business information.
You can refer to this link for more clarification.
One of our customers is in the manufacturing domain. He has multiple factories across the country. For quality control, he is using a Windows application deployed independently in each factory (approximately 100 in count). Our customer is interested in replacing all of these Windows applications with a single web application. Now the problem is that the volume of the data will be 100 times bigger, and the same goes for the velocity (if we keep a single database for all the factories). There are lots of reporting use cases in this domain. Looking at the numbers, it looks like SQL will not be able to handle this much load.
Is it a valid use case to move to NoSQL database?
Can Volume/Velocity alone be a deciding factor to move to NoSQL?
Would we be able to get all those reporting from NoSQL database efficiently?
Any kind of help will be appreciated.
Thanks In Advance
This is a useful discussion.
In my opinion, a well-designed MS SQL Server 2012 (or Oracle server, but I have no experience with that) should be capable of handling 1000 complex transactions per second.
MS SQL Server 2014 with in-memory processing raises expectations even higher.
Consider multiple processors, large memory, table partitioning, file mapping, and multiple access paths to the SAN or to separate server disks. Use well-designed transactions (and consider removing most indexes on transaction tables).
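As a concrete illustration of the table-partitioning point, here is a minimal T-SQL sketch (all object names are hypothetical) that splits a large measurement table by month so queries and maintenance touch only the relevant slices. In practice you would map the partitions to separate filegroups/disks rather than ALL TO [PRIMARY], which ties in with the SAN/disk remark above.

    -- Hypothetical monthly partitioning of a large transaction table.
    CREATE PARTITION FUNCTION pf_MeasurementMonth (datetime)
        AS RANGE RIGHT FOR VALUES ('2014-01-01', '2014-02-01', '2014-03-01');

    CREATE PARTITION SCHEME ps_MeasurementMonth
        AS PARTITION pf_MeasurementMonth ALL TO ([PRIMARY]);

    CREATE TABLE dbo.QualityMeasurements (
        MeasurementId  bigint        NOT NULL,
        FactoryId      int           NOT NULL,
        MeasuredAt     datetime      NOT NULL,
        MeasuredValue  decimal(18,4) NOT NULL
    ) ON ps_MeasurementMonth (MeasuredAt);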
As an extra benefit you keep all the functionality of SQL Server. In my opinion most NoSQL solutions are NoSQL because they are deprived of essential SQL functionality.
Switching to a NoSQL database is most useful when you require functionality outside the transaction domain, e.g. document indexing or network indexing.
I have read all those articles about data warehousing and OLAP; however, I have some questions about it.
I have created a data warehouse using MySQL, and I have also created an API which contains ad-hoc queries against the data warehouse. Is this API considered ROLAP?
Is it possible to create own OLAP? If yes, how?
Usually a data warehouse has a normalized structure, and a DWH is not the same as ROLAP.
ROLAP is a technique used to model data, and it is usually used for reporting. ROLAP is very good for analytical queries, and you can use many reporting (BI) tools to easily build reports on your data.
It isn't necessary to write your own application to build reports. ROLAP (relational OLAP) is when you model your data as a "star" or "snowflake" schema, using fact and dimension tables in a traditional RDBMS. These star schemas are also called "multidimensional cubes".
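For illustration, a minimal star schema in plain SQL might look like this (all table and column names are made up): one fact table surrounded by dimension tables, plus the kind of aggregate query a reporting (BI) tool would generate against it.

    -- Hypothetical star schema in a traditional RDBMS (ROLAP).
    CREATE TABLE dim_date (
        date_key      INT PRIMARY KEY,
        calendar_date DATE NOT NULL,
        month_name    VARCHAR(20) NOT NULL,
        year_number   INT NOT NULL
    );

    CREATE TABLE dim_product (
        product_key   INT PRIMARY KEY,
        product_name  VARCHAR(100) NOT NULL,
        category      VARCHAR(50) NOT NULL
    );

    CREATE TABLE fact_sales (
        date_key      INT NOT NULL,
        product_key   INT NOT NULL,
        quantity      INT NOT NULL,
        amount        DECIMAL(18,2) NOT NULL
    );

    -- Typical "slice and dice" query over the star schema.
    SELECT d.year_number, p.category, SUM(f.amount) AS total_amount
    FROM fact_sales f
    JOIN dim_date d    ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY d.year_number, p.category;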
By OLAP, people often mean MOLAP (multidimensional OLAP): this is when you really store your data in a multidimensional data structure in special data stores (not in an RDBMS).
You shouldn't create your own MOLAP data storage; you should use already-developed OLAP servers like MANDARIN, Pentaho OLAP, Essbase, or the Oracle EE database with the OLAP option.
The confusion you are pointing out comes from the fact that people tend to use this term everywhere and in the wrong context.
OLAP applications are precisely defined by the OLAP Council. These are applications that fulfill a set of requirements. You can read these requirements here.
In broad terms, these are analytically oriented applications that allow you to build reports in a multidimensional fashion (meaning you have dimensions and indicators that you can cross) and get fast answers at enterprise scale, with drill-down and drill-across capabilities. Something close to an OLAP application is this: http://try.meteorite.bi/
Building an ad-hoc reporting engine on top of a data warehouse doesn't mean you have an OLAP application. Does it have a multidimensional shape? Is it user oriented? Is it fast enough? It has to answer yes to all of these questions, and to the requirements linked above, to be a candidate OLAP application.
I'm new to Analysis Services.
My first cube has been deployed and it seems to work.
Dimension tables are ok and fact tables are ok.
My question is very simple: if I add a new record to the related data source table and then browse the cube, I don't see the new record until I process the cube again.
In my mind, if new records are added, then the cube should reflect the changes.
How do I solve this issue? Do I need to reprocess the cube every time a new record is added? That is impossible, of course.
You understand that your cube essentially represents a bunch of aggregated measures? That means that when the cube is processed, it looks at all the data in your fact tables and processes the measures (according to the dimensions).
The result of this is that you're able to access the data in the cube quickly and efficiently. The downside, as you have mentioned, is that when new data is added to the fact table, the cube isn't updated.
Typically there will be a daily batch job that updates the cube with the latest fact data; depending on the amount of data you have and the "real-time" requirements, this could be done more than once per day. A lot of people do this out of hours.
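If you go the scheduled route, one common approach (sketched below with made-up job and database names, and assuming the ANALYSISCOMMAND job-step subsystem is available in your edition of SQL Server Agent) is a SQL Server Agent job whose step sends an XMLA ProcessFull command to the SSAS instance:

    USE msdb;
    GO
    -- Hypothetical nightly job that fully reprocesses the SSAS database.
    EXEC dbo.sp_add_job @job_name = N'Nightly cube process';

    EXEC dbo.sp_add_jobstep
        @job_name  = N'Nightly cube process',
        @step_name = N'Process OLAP database',
        @subsystem = N'ANALYSISCOMMAND',
        @server    = N'MySsasServer',   -- SSAS instance name (assumption)
        @command   = N'<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
                         <Object>
                           <DatabaseID>MyOlapDatabase</DatabaseID>
                         </Object>
                         <Type>ProcessFull</Type>
                       </Process>';

    EXEC dbo.sp_add_jobserver @job_name = N'Nightly cube process';

Attach a schedule to the job (e.g. nightly, out of hours) in SQL Server Agent, or run the same processing from an SSIS Analysis Services Processing Task if your loads are done in SSIS.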
If you look closely in BIDS you will notice on the Partitions tab that for each partition it has a Storage Mode which you can define.
I would recommend you read this article: http://sqlblog.com/blogs/jorg_klein/archive/2008/03/27/ssas-molap-rolap-and-holap-storage-types.aspx
Basically, there are a few different modes you can use:
MOLAP (Multi dimensional Online Analytical Processing)
MOLAP is the most commonly used storage type. It's designed to offer maximum query performance to the users. Data AND aggregations are stored in an optimized format in the cube. The data inside the cube is refreshed only when the cube is processed, so latency is high.
ROLAP (Relational Online Analytical Processing)
ROLAP does not have the high latency disadvantage of MOLAP. With ROLAP, the data and aggregations are stored in relational format. This means that there will be zero latency between the relational source database and the cube.
The disadvantage of this mode is performance: it gives the poorest query performance because no objects benefit from multidimensional storage.
HOLAP (Hybrid Online Analytical Processing)
HOLAP is a storage type between MOLAP and ROLAP. Data will be stored in relational format (ROLAP), so there will also be zero latency with this storage type.
Aggregations, on the other hand, are stored in multidimensional format (MOLAP) in the cube to give better query performance. SSAS listens for notifications from the source relational database; when changes are made, SSAS gets a notification and processes the aggregations again.
With this mode it’s possible to offer zero latency to the users but with medium query performance compared to MOLAP and ROLAP.
To get the real-time reporting without having to reprocess your cube you will need to try out ROLAP, but beware, the performance will suffer (depending on the size of your cube and server!).
I'm a few months into developing a reporting solution. Currently I am loading a relational data warehouse (Fact and Dimension tables) using SSIS. SSAS cubes and dimensions are then created from the relational Data warehouse. I then use SSRS to build reports using MDX queries.
The problem I have is that things are starting to get rather complicated trying to understand how multidimensional modelling works as well as MDX and cubes. Since the organization it's being designed for is rather small, I'm thinking that I should re-evaluate my approach.
I think maybe I should just eliminate SSAS from the picture and simply create reports that report directly off the relational data warehouse using SQL queries. The relational data warehouse could still be loaded nightly to allow up to date data for reporting.
I'm just wondering if that would be a good idea considering I'm not very experienced with data warehousing and SSAS. I also wanted to know whether keeping my relational data warehouse in dimension and fact tables would still work with SQL queries, or whether I would need to redesign the tables. I don't want to make the decision to eliminate SSAS if that will end up causing more headaches or issues.
The reports will not include complicated calculations besides row counts and YTD percentages, for example "How many callers were male?" and "How many callers called for Product A?", which are then broken down by month.
Any comments or suggestions are much appreciated, because I'm starting to feel rather frustrated trying to get SSAS cubes developed properly.
I was in a similar situation at my company. I had never used SSAS, and I was asked to do research on the benefits of using cubes to do some reporting. It was a pretty steep learning curve because my background is in development not data and reporting. SSAS is most useful when aggregate queries on a relational database are time consuming and if reports need to be broken down into hierarchies that an analyst can use to better understand the state of the business. Since SSAS stores aggregate info, queries of that nature are very quick. If your organization's data is small, the relational queries might be quick enough that you don't really need the benefit of storing aggregates.
You also need to take into consideration the maintainability of using SSAS. If you're having trouble figuring out SSAS and MDX, how easy a time will others have? I tried to explain an MDX query I wrote to my boss, who is experienced with SQL, but it's really quite different from relational queries. How easy is it going to be to add more complex reports?
One benefit of using SSAS is that it can put the analyst in control of the report. Second, there are great tools and support. Finally, it's pretty easy to deploy and connect to.
Yes, you can remove SSAS from your architecture, because all the results you can get from an MDX query against SSAS you can also get from a T-SQL query against your data warehouse, since the cube was built by reading data from the DW. BUT bear in mind the following: the main advantage of an OLAP cube, in my point of view, is aggregations.
A very simple explanation: let's say you have a fact table called Orders with 1 million orders per month. If you want to know how much you sold in that month, using SQL you need to read the rows one by one and sum the values to produce the total. That's roughly 1 million reads on your DB. If you have a cube with the proper aggregations configured, that value can be pre-calculated and pre-stored in the cube, so when you need to know how much you sold in a month, it takes only one read from your cube.
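To picture it, this is roughly the query the data warehouse has to execute on every request (table and column names are invented), scanning all of the month's fact rows each time, whereas a cube with a matching aggregation design answers the same question from a pre-computed value:

    -- Without a cube: scans ~1 million fact rows on every execution.
    SELECT SUM(f.order_amount) AS monthly_sales
    FROM dbo.FactOrders AS f
    WHERE f.order_date >= '2014-05-01'
      AND f.order_date <  '2014-06-01';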
It's a matter of analysing your situation: if you have a small cube, maybe aggregations are not necessary and you can do fine with SQL, but depending on the situation they can be very helpful.