I want to know whether I can use SSAS to replace the traditional approach of writing multiple complex stored procedures to produce output.
For example, I have around 200 procedures covering various business requirements, which I use in 200 SSIS packages (one procedure per package) to generate 200 flat files. Can SSAS help me reduce the number of procedures and packages and still produce the required 200 flat files?
It all depends on what you are trying to achieve.
If you use OLAP you will need to rebuild the DataWarehouse each time (SSIS supports this though) and then you will need
SSAS is data warehousing. There are no constraints, and the MDX language is not what you would call well documented, so there would be a learning curve to implement the queries.
Read Why to build a SSAS Cube? for info on SSAS.
All in all, you CAN use SSAS as a data source in SSIS (look here), so it is a viable solution to query the warehouse for your data. It is just not something I would do.
I have about 10 tables that I use to create a SQL view. Some of these are pure lookup tables. And there are some SELECT clause columns that use Concatenation and other basic functions.
The data volume is expected to be about 2 million rows.
I need to use the data from the SQL view to load into final tables using Azure Data Factory data flow.
Now I am just wondering if I should remove the lookup part from the view and add that to ADF flow. And also for the concatenation and simple functions in SELECTed columns in the view, I think I should maybe use Derived Column transformation in the data flow.
Just trying to understand what would the best practice be. And also, in terms of performance which one would be better?
Is it better to perform the required transformations and lookups in the view? That way the source data would be available in the required format and flow would be faster because not much transformation would be processed.
But if I have to do all the lookups and transformations in the SQL view itself, then what is the benefit of using an ETL tool like ADF?
I suggest you keep it all in the SQL and use a standard ADF data copy. It's easier to troubleshoot if you can log on to the SQL Server and run a view, versus digging through transformations in an ETL tool.
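To make the "keep it in the view" option concrete, here is a minimal sketch of a view that does both the lookup and the concatenation, so the copy step only has to read finished rows. The tables customer and country and the view name are hypothetical, and SQLite stands in for SQL Server (in T-SQL the concatenation would use + or CONCAT rather than ||).

```python
import sqlite3

# In-memory database standing in for the source SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source table and pure lookup table.
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        country_code TEXT
    );
    CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT);

    INSERT INTO country VALUES ('DE', 'Germany'), ('FR', 'France');
    INSERT INTO customer VALUES
        (1, 'Ada', 'Lovelace', 'DE'),
        (2, 'Alan', 'Turing', 'FR');

    -- The view resolves the lookup and builds the derived column,
    -- so the downstream copy needs no transformation logic.
    CREATE VIEW v_customer_export AS
    SELECT c.id,
           c.first_name || ' ' || c.last_name AS full_name,
           k.name AS country_name
    FROM customer AS c
    LEFT JOIN country AS k ON k.code = c.country_code;
""")

rows = conn.execute(
    "SELECT id, full_name, country_name FROM v_customer_export ORDER BY id"
).fetchall()
print(rows)
```

Running the SELECT against the view directly is also the troubleshooting step described above: the output can be inspected without opening the ETL tool.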
The copy activity doesn't support any transformation, but ADF data flows do.
However, ADF data flows have to spin up a Spark cluster to do anything, and that can take up to five minutes.
"What is the benefit of using an ETL tool like ADF?"
When you write your transformations in SQL, you end up with a mass of code that can sometimes be difficult to maintain and audit.
When you write your transformations in an ETL tool, you often get useful plumbing like logging, lineage and debugging. An ETL tool can also more easily span multiple data sources; for example, it can combine a text file with a database table.
Some people also prefer the visual aspect of ETL tools.
Many ETL tools also come with out of the box templates/wizards that assist in mundane tasks.
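As a rough illustration of what "spanning multiple data sources" means, the sketch below combines a text file with a database table in plain Python. Everything in it is made up for the example: the orders table, the currency rates, and the function names. An ETL tool would express the same join as a visual lookup step with logging around it.

```python
import csv
import io
import sqlite3

def load_rates(csv_text):
    """Read currency rates from CSV text (stands in for a flat file)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["currency"]: float(row["rate"]) for row in reader}

def orders_in_usd(conn, rates):
    """Join each database row to the rate that came from the text file."""
    result = []
    query = "SELECT id, currency, amount FROM orders ORDER BY id"
    for order_id, currency, amount in conn.execute(query):
        result.append((order_id, round(amount * rates[currency], 2)))
    return result

# Hypothetical flat-file content and database table.
RATES_CSV = "currency,rate\nEUR,1.10\nGBP,1.25\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, currency TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EUR", 100.0), (2, "GBP", 80.0)],
)

converted = orders_in_usd(conn, load_rates(RATES_CSV))
print(converted)
```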
I have several SSAS cubes which are processed daily. This is done with an SSIS project which is scheduled to run daily at a set time. Sometimes we have ETL issues and my processing job chugs ahead without knowing that the underlying tables are incomplete. Depending on the nature of the ETL error, sometimes processing completes successfully, but with stale data; sometimes I get an SSAS error.
I want to optimize the processing jobs so that they check/wait until the objects that a given cube depends on are done extracting before sending the process command to the cube.
SSAS is 2016, SQL Server is 2014. The cubes are using query binding in the data source view, and I am extracting metadata from the XMLA script that is generated during the deployment.
A potential solution I thought of is to parse out the query information in the XMLA's data source view node. That would complicate my deployment/metadata extraction process quite a bit, but would be worth it if there is no other way.
I have also searched through the DISCOVER_SCHEMA_ROWSETS DMV, and I could not find a DMV that would return these DSV queries, although I am not very familiar with the DMVs, so I may have missed one.
It should also be noted that I am lazy and do not want to duplicate development work by maintaining an index of dependencies separate from my SSAS projects. This solution needs to pull the data programmatically, and not rely on me keeping something updated, because I am an unreliable human.
How do you ensure your cube processing starts after the dependent objects are extracted? Do you programmatically identify the SQL objects that an SSAS 2016 cube depends on? If so, how do you do it, and what are the pros and cons of your approach?
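The XMLA-parsing idea mentioned above could look roughly like the following sketch. It rests on several assumptions that should be checked against a real deployment script: that the query text sits in elements or attributes whose local name is QueryDefinition, that a simple regular expression over FROM/JOIN is good enough for the queries in use (it ignores subqueries, CTEs and comments), and that a list of already-loaded tables is available from the ETL side. The sample XML and its namespaces are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]

def extract_queries(xmla_text):
    """Collect query text from QueryDefinition elements or attributes.

    Assumption: the bound queries are stored under that name.
    """
    queries = []
    for elem in ET.fromstring(xmla_text).iter():
        if local_name(elem.tag) == "QueryDefinition" and elem.text:
            queries.append(elem.text.strip())
        for attr, value in elem.attrib.items():
            if local_name(attr) == "QueryDefinition":
                queries.append(value.strip())
    return queries

# Matches the object name after FROM or JOIN, with or without [brackets].
_NAME = r"(?:\[[^\]]+\]|\w+)"
TABLE_RE = re.compile(
    rf"\b(?:FROM|JOIN)\s+({_NAME}(?:\.{_NAME})*)", re.IGNORECASE
)

def referenced_tables(sql):
    """Return normalized (lowercase, bracket-free) table names in a query."""
    return {
        name.replace("[", "").replace("]", "").lower()
        for name in TABLE_RE.findall(sql)
    }

def dependencies(xmla_text):
    """Union of all tables referenced by any query in the script."""
    tables = set()
    for sql in extract_queries(xmla_text):
        tables |= referenced_tables(sql)
    return tables

def missing_dependencies(deps, loaded_tables):
    """Tables the cube needs that the ETL has not reported as loaded."""
    return deps - {t.lower() for t in loaded_tables}

# Invented sample; a real script uses Microsoft's own namespaces.
SAMPLE_XMLA = """<Batch xmlns="urn:example:engine" xmlns:p="urn:example:prop">
  <Partition>
    <Source>
      <QueryDefinition>SELECT * FROM [dbo].[FactSales] f
        JOIN dbo.DimDate d ON f.DateKey = d.DateKey</QueryDefinition>
    </Source>
  </Partition>
  <element p:QueryDefinition="SELECT * FROM dbo.DimProduct" />
</Batch>"""

deps = dependencies(SAMPLE_XMLA)
print(sorted(deps))
print(sorted(missing_dependencies(deps, ["dbo.FactSales", "dbo.DimProduct"])))
```

A processing job could then poll until missing_dependencies returns an empty set before sending the process command. The regular expression is the weak point of this approach; a proper SQL parser would be more robust but adds a dependency.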
Microsoft introduced MDX for Analysis Services, and since then a few things have changed in the marketplace. Microsoft now has the column-store-based Analysis Services Tabular and Power Pivot, which run on DAX. Database vendors have also moved to in-memory engines (SAP HANA). I long ago gave up on MDX as unnecessary in the current DAX tabular environment; however, the SAP HANA Excel plug-in now uses MDX to query HANA models, and I'm trying to assess whether it's worth learning MDX again.
Thanks
Using MDX is one of several options to query SAP HANA information models.
Standard SQL queries would do just as well.
MDX is mainly aimed at providing a common interface language to access data sources and return the data into multi-dimensional structures.
It also provides several language concepts not covered by SQL, e.g. hierarchy processing.
I've yet to see a user that would write his MDX statements for ad-hoc reporting by hand...
I work for a company that has a very mature and precise OLAP environment - MDX is 100% relevant.
We will start to look to move certain functionality into the Tabular/DAX world but I wouldn't imagine stopping MDX for a good while.
To me it is a very pretty declarative language - elegant and powerful - much more so than SQL or what I've so far seen of DAX.
If SQL is checkers (draughts) then MDX is chess!
I'm working on research project using Microsoft BI Stack and Oracle Database. When working with these two technologies I was able to use SSIS and SSRS.
My process was to connect to the Oracle database through SSIS and SSAS.
It worked, but SSIS is responsible for the ETL process, and I was still able to create cubes in SSAS without using SSIS.
What's the difference between these two processes?
When using SSAS without SSIS, is it automatically invoking the SSIS process (ETL) behind the scenes?
If yes, what's the point of having SSIS?
SSIS is an ETL framework, which means it is designed to EXTRACT, TRANSFORM and LOAD data from one or multiple sources to one or multiple destinations.
SSAS is an OLAP (online analytical processing) engine designed to aggregate data from one or multiple sources for faster multidimensional queries.
Both tools CAN be used together, in the sense that SSIS can be used to build a data warehouse on which SSAS builds its cubes, but they are in no way dependent on each other.
You can also use SSIS to process your cube automatically (either full or partial).
To answer your questions more clearly: No, SSAS does not invoke SSIS.
Yes, you can build cubes in SSAS directly against Oracle data and avoid SSIS entirely, although most people prefer to copy the data into MS SQL because MS SQL is CHEAPER than Oracle.
I would like to know more about "MDX" (Multidimensional Expressions).
What is it?
What is it used for?
Why would you use it?
Is it better than SQL?
What is its use in SAP BPS (I haven't seen BPC, just heard that MDX is in it and want to know more)?
MDX is the query language developed by Microsoft for use with their OLAP tools. Since its creation, others (the open source project Mondrian, and Hyperion) have tried to create versions of it for use in their products.
OLAP data tends to look like a star-schema with a central fact table surrounded by multiple dimensions. MDX is designed to allow you to query these structures and create cross-tab type results.
While the language looks like SQL, it doesn't behave like it, and if you are an SQL programmer, the mental leap can be tough.
As to whether it is better than SQL: it serves a highly specialized purpose, i.e. analyzing data in a specific format. So if you want to query a star schema, it is better; otherwise, SQL will probably do the job.
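For readers who have not seen a star schema, here is a minimal sketch of one fact table with two dimensions, queried into a cross-tab. It uses SQL plus a small Python pivot rather than MDX, since MDX needs an OLAP server to run against; all table, column and cube names are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two dimensions surrounding one central fact table.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact_sales (product_key INTEGER, date_key INTEGER, amount REAL);

    INSERT INTO dim_product VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO dim_date VALUES (10, 2015), (11, 2016);
    INSERT INTO fact_sales VALUES
        (1, 10, 100), (1, 11, 150), (1, 11, 50),
        (2, 10, 20), (2, 11, 30);
""")

# Roughly what an MDX query such as
#   SELECT [Date].[Year].MEMBERS ON COLUMNS,
#          [Product].[Category].MEMBERS ON ROWS
#   FROM [Sales] WHERE [Measures].[Amount]
# would return directly (cube and member names are made up).
query = """
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY p.category, d.year
"""

# Pivot the grouped rows into rows = category, columns = year.
crosstab = defaultdict(dict)
for category, year, total in conn.execute(query):
    crosstab[category][year] = total

for category in sorted(crosstab):
    print(category, crosstab[category])
```

The extra pivot step in Python is the part MDX makes unnecessary: placing members on rows and columns is built into the language.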
MDX means Multi Dimensional eXpressions or some such. It is relevant to OLAP cubes and not to regular relational databases such as Oracle or SQL Server (although some SQL Server editions come with Analysis Services which is OLAP). The multidimensional world is about data warehousing and efficient reporting, not about doing normal transactional processing so you wouldn't use it for an order entry system, but you might move that data into a datamart to run reports against to see sales trends. That should be enough to get you started I hope.
SQL is for 'traditional' databases (OLTP). Most people learn the basics fairly easily.
MDX is only for multi-dimensional databases (OLAP), and is harder to learn than SQL in my opinion. The trouble is they look very similar.
Many programmers never need MDX even if they have to query multi-dimensional databases, because most analysis software forces them to build reports with drag-drop interfaces.
If you don't have a requirement to work with a multi-dimensional database, then don't create one just for the fun of it... it won't be.
There are two versions of SAP-BPC (Business Objects Planning and Consolidation):
SAP-BPC Netweaver
SAP-BPC Microsoft Analysis Services
The Microsoft Analysis Services version of the product allows you to use MDX (Multidimensional Expressions) both to query the multi-dimensional database (OLAP) and to write calculation logic.
However, SAP-BPC does not require a knowledge of MDX to either be used or administered.
You can see product documentation and a demonstration.
Best of luck on your research,
Focused on SAP BPC:
What is it used for?
It's used when you want to apply some custom calculation/business logic over many records/intersections after submitting raw data. For example, first send prices in one input schedule, then quantities in another, and as a third step run a calculation of the sales amount based on prices and quantities for all products.
It's also used to execute the Business Rules; for that you run a predefined program (like CALC_ACCOUNT, CONSOLIDATION, etc.).
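The price-and-quantity example can be sketched outside BPC as follows. This is plain Python, not BPC script logic; the product IDs and values are invented, and it only shows the shape of the calculation: two separately submitted inputs, then a third step that derives a new value for every product that has both.

```python
def sales_amounts(prices, quantities):
    """Compute price * quantity for every product present in both inputs."""
    common = prices.keys() & quantities.keys()
    return {product: prices[product] * quantities[product] for product in common}

# Step 1 and step 2: raw data submitted separately (hypothetical values).
prices = {"P1": 2.5, "P2": 10.0}
quantities = {"P1": 4, "P2": 3, "P3": 7}  # P3 has no price yet

# Step 3: the calculation run over all products.
amounts = sales_amounts(prices, quantities)
print(sorted(amounts.items()))
```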
Is it better than SQL?
In BPC, "SQL" logic scripts have better performance than MDX. However, SQL for BPC purposes does not have much to do with SQL as used elsewhere; it's just what they call it.
You will get a good start by just searching for MDX in the search box up top.