How can I find out which tabular models are being processed within SSAS?

We have many models on our SSAS instance and we made it easy for users to start the model processing (through a bespoke Excel add-in).
Now, a side effect is that there are apparently several Tabular models being processed at the same time.
Is there a way to find which tabular models are currently being processed?

You can use the SSAS Activity Monitor tool. The nice part about this tool is that you can just launch it and, within seconds, be looking at all activity on the SSAS instance.
You connect to your SSAS instance and then use the Current Queries panel. There will be a row for every query. When a model is being processed, the COMMAND_TEXT will start with <Batch...
and further into the COMMAND_TEXT you'll find the DatabaseID and the process type.
You could also query the DMVs directly. This query from SSMS found it as well, but it's harder to see there:
select * from $SYSTEM.DISCOVER_SESSIONS;
But you could put this SQL into Power Query and have it parse and filter down nicely.
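For processing specifically, $SYSTEM.DISCOVER_COMMANDS is often easier to read, because its COMMAND_TEXT column holds the <Batch...> command that names the DatabaseID. A minimal sketch (the DMV SQL dialect is limited, so do any fine-grained filtering client-side, e.g. in Power Query or Excel):

-- one row per command currently or recently running on the instance;
-- look for COMMAND_TEXT values starting with <Batch that reference a DatabaseID
select SESSION_SPID,
       COMMAND_START_TIME,
       COMMAND_ELAPSED_TIME_MS,
       COMMAND_TEXT
from $SYSTEM.DISCOVER_COMMANDS
order by COMMAND_ELAPSED_TIME_MS desc;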

Related

How to see how many people are accessing a ssas cube?

I want to know how many people are accessing a particular SSAS cube. Please let me know the process for finding them.
I don't think there's a simple way to get the connected users for a single cube. You can use a couple of different tools to collect that information (along with lots of other information as well).
To get a list of users currently connected to an SSAS instance, you might use SQL Server Profiler; you can then filter the resulting data down to one cube.
For tracking users over an extended period of time, you can use Extended Events.
Both of these tools produce a lot of data, and you'll need to parse distinct users out of it in some way.
To see the current sessions:
SELECT * FROM $System.DISCOVER_SESSIONS
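If you only need a quick look at who is connected right now (rather than a trace over time), the connection and session DMVs already carry the user name and the database each session is pointed at. A rough sketch you can run from an MDX query window in SSMS:

-- one row per open connection, with the user behind it
SELECT CONNECTION_USER_NAME, CONNECTION_HOST_NAME, CONNECTION_START_TIME
FROM $System.DISCOVER_CONNECTIONS;

-- one row per session; SESSION_CURRENT_DATABASE lets you narrow down to the database behind one cube
SELECT SESSION_USER_NAME, SESSION_CURRENT_DATABASE, SESSION_LAST_COMMAND_START_TIME
FROM $System.DISCOVER_SESSIONS;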

What's the best method of creating a SSRS report that will be run manually many times with different Parameters?

I have a SSRS Sales report that will be run many times a day by users, but with different parameters selected for the branch and product types.
The SQL query uses some large tables and is quite complex, therefore, running it many times is going to have a performance cost.
I assumed the best solution would be to create a dataset for the report with all permutations, run once overnight, and then apply filters when the users run the report.
I tried creating a snapshot in SSRS, which doesn't consider the parameters and therefore has all the required data, and then filtering the tablix using the parameters the users selected. The snapshot works fine, but it appears to be refreshed whenever the report is run with different parameters.
My next solution would be to create a table for the dataset which the report would then point to. I could recreate the table every night using a stored procedure. With a couple of small indexes the report would be lightning fast.
This solution would seem to work really well but my knowledge of SQL is limited, and I can’t help thinking this is not the right solution.
Is this suitable? Are there better ways? Can anybody confirm either way?
SSRS datasets have caching capabilities. I think you'll find this more useful than having to create extra DB tables and such.
Please see here https://learn.microsoft.com/en-us/sql/reporting-services/report-server/cache-shared-datasets-ssrs?view=sql-server-ver15
If the rate of change of the data is low enough, and SSRS Caching doesn't suit your needs, then you could manually cache the record set from the report query (without the filtering) into its own table, then you can modify the report to query from that table.
Oracle and most data warehouse platforms have a formal mechanism specifically for this, called materialized views. There's no such luck in SQL Server, though you can easily implement the same pattern yourself.
There are two significant drawbacks to this:
The data in the new table is a snapshot at the point in time that it was loaded, so this technique is better suited to slow moving datasets or reports where it is not critical that the data is 100% accurate.
You will need to manage the lifecycle of the data in this table, ideally you should setup a Job or Scheduled Task to automate this refresh but you could trigger a refresh as part of the logic in your report (not recommended, but possible).
Though it is possible, you should NOT consider using a TRIGGER to update the data: you have already indicated that the query takes some time to execute, so this could have a major impact on the rest of your LOB application.
If you do go down this path you should write the refresh logic into a stored procedure so that it can be executed when needed and from other internal and external automation mechanisms.
You should also add a column that records the date and time when the dataset was refreshed, then replace any references in your report that display the date and time the report was printed with the time the data was prepared.
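As a rough illustration of that refresh pattern (a sketch only; the table, column, and procedure names here are hypothetical, and dbo.Sales stands in for your existing complex query):

-- run once: a couple of small indexes keyed to the report parameters
CREATE NONCLUSTERED INDEX IX_SalesReportCache_Branch_Product
    ON dbo.SalesReportCache (Branch, ProductType);
GO

-- rebuild the cache table; schedule this via a SQL Agent job overnight
CREATE OR ALTER PROCEDURE dbo.RefreshSalesReportCache
AS
BEGIN
    SET NOCOUNT ON;

    TRUNCATE TABLE dbo.SalesReportCache;

    INSERT INTO dbo.SalesReportCache (Branch, ProductType, SalesAmount, DataAsOf)
    SELECT s.Branch,
           s.ProductType,
           SUM(s.SalesAmount),
           SYSDATETIME()          -- the "data prepared" time the report should display
    FROM dbo.Sales AS s           -- stand-in for the existing complex report query
    GROUP BY s.Branch, s.ProductType;
END;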
It is also worth pointing out that performance issues with expensive queries in SSRS reports can often be overcome by reducing the functions and value formatting in the SQL query itself and moving that logic into the report definition. This goes for filtering operations too: you can easily add computed columns in the dataset definition or on the design surface, and you can implement filtering directly in the tablix as well. There is no requirement that every record from the SQL query be displayed in the report, just as we do not need to show every column.
Sometimes some well-crafted indexes can help too; for complicated reports we can often find a balance between what the SQL engine can do efficiently and what the RDL can do for us.
Disclaimer: This is hypothetical advice, you should evaluate each report on a case by case basis.

Power BI maxing connections to DB :( Can we populate multiple tables with single Sql.Database call?

I am assisting my team troubleshoot an issue with a Power BI report we are developing. We have a rather complex data model in the source SQL database, so we have created 5-6 views to better manage the data. We have a requirement to use DirectQuery, as one key requirement for the report is that the most up-to-date data in the database is visible, rather than having a delay in loading/caching the data. We also have the single data source, just the one database.
When we run the report, we see a spike of 200-500 connections to the database from the specific user for the report data source, and those connections don't close. This is clearly an issue and unsustainable for any product. We have a ticket open with Microsoft premium support to address the connections not closing, but in the meantime, I'm wondering if we're doing something wrong inside the report?
When I view the queries in the query editor, we basically have one query for each view, and it's a simple:
let
    Source = Sql.Database(Server, Database),
    query_view_name = Source{[Schema ......]}[Data]
in
    query_view_name
(I don't have the raw code in front of me, but that's the gist of it.)
It seems to me, based on analytics in the database, that "Sql.Database" is opening a new connection every time a view is called. With 5-6 views, that's 5-6 connections at a minimum; then each time a filter is changed there are more connections, and it compounds from there until the database connection pool is maxed out.
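(One way to confirm the pile-up from the SQL Server side is the standard session DMV; this is just a sketch, and the program_name values you see will depend on what your Power BI connections actually report:)

-- count open sessions per login/application to see how many the report is holding
SELECT login_name,
       program_name,
       COUNT(*) AS open_sessions
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY login_name, program_name
ORDER BY open_sessions DESC;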
Is there a way to populate all the tables using a single connection to the database? Why would Power BI be using so many connections? Can we populate multiple tables in the advanced query editor? Using DirectQuery, are there any suggestions for what we can look at/troubleshoot/change in the report?
Thanks!
Power BI establishes multiple connections to the database to load multiple tables in parallel. If you don't want this, you can turn it off under Options -> Current file -> Data Load -> Enable parallel loading of tables.
Keep in mind that turning this option off will most likely increase the model loading time.
You may also want to take a look at the Maximum connections per data source option under Options -> Current file -> DirectQuery, and the whole Query reduction section beneath it. Turning on Slicer selection and Filter selection on that page is highly recommended for cases like yours, but you will need to train your users to click Apply to see the results.
Ok.
We have a rather complex data model in the source SQL database, so we have created 5-6 views to better manage the data.
That's fine.
We have a requirement to use DirectQuery,
But now you're going to have a bad time. DirectQuery + complex views is a recipe for poor performance. Queries against your views will add joins, potentially across the whole model for filter context, as well as Measure and Calculated Column expressions. And these queries will change dynamically, based on the user's interaction with the report. So it's very difficult to see and test all the possible queries.
Basic guidance is to use import mode against views, and only use DirectQuery against properly-indexed tables. To address data freshness, you can replace the views with tables you load and keep up-to-date from your application, or perhaps use an Indexed View, etc.
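If you go the indexed view route, the gist is SCHEMABINDING plus a unique clustered index, so the aggregation is materialized and DirectQuery reads stored rows instead of re-running the joins. A sketch with made-up table and column names (indexed views come with restrictions, e.g. COUNT_BIG(*) is required when you use GROUP BY):

CREATE VIEW dbo.vw_SalesSummary
WITH SCHEMABINDING
AS
SELECT s.BranchId,
       s.ProductTypeId,
       SUM(s.SalesAmount) AS SalesAmount,   -- SalesAmount assumed NOT NULL (an indexed-view restriction for SUM)
       COUNT_BIG(*)       AS RowCountBig    -- required for an indexed view with GROUP BY
FROM dbo.Sales AS s
GROUP BY s.BranchId, s.ProductTypeId;
GO

-- the unique clustered index is what makes the view "indexed" (materialized)
CREATE UNIQUE CLUSTERED INDEX IX_vw_SalesSummary
    ON dbo.vw_SalesSummary (BranchId, ProductTypeId);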

Querying PowerBI data from MS-SQL

I finally decided to ask this (after a lot of google searching):
So we use Power BI for data visualization, and in it are some calculated dashboards / data outputs which are used to monitor data quality etc. I want to be able to log these results historically so that over time we can monitor progress, i.e. whether data quality has improved. That is the underlying problem.
One approach to this problem was to connect to Power BI from the MS-SQL side, hoping we can then set timed triggers to do the logging by reading the Power BI dashboards. So how do I query that? I have already developed a method to determine the connection using the Power BI port, as described here:
EXPORTING DATA FROM POWER BI DESKTOP TO MS-SQL
This is a screenshot from one of my MS-SQL connections through "Analysis Services":
I am assuming the objects named like "LocalDateTable_" are the actual BI analyses I want to query. "New Query" is an MDX type of query. Should I go this route for my problem (logging Power BI analyses)?
At first this sounds crazy but on reflection I guess it was only a matter of time, and a sign of the maturity of Power BI solutions ...
I would use the SQL Server Profiler to capture the queries generated while you use your dashboard & report.
https://insightsquest.com/2017/05/07/profiler-trace-for-power-bi-desktop/
Then I would build an SSIS package to run the MDX queries and deliver the datasets to SQL Server, with extra columns e.g. StartTime.
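If you want to skip SSIS for a first pass, the same captured query can also be pulled straight into SQL Server through a linked server over the MSOLAP provider. This is only a sketch: the port changes every time Power BI Desktop restarts, the table name 'Data Quality Measures' is hypothetical, and you would INSERT the result into your own log table once you know the column names that come back.

-- one-time setup: linked server pointing at the local Power BI Desktop AS instance
EXEC master.dbo.sp_addlinkedserver
     @server     = N'PBI_DESKTOP',
     @srvproduct = N'',
     @provider   = N'MSOLAP',
     @datasrc    = N'localhost:51542';   -- replace with the port you discovered

-- run the captured DAX/MDX through OPENQUERY and stamp it with the load time
SELECT SYSDATETIME() AS StartTime, q.*
FROM OPENQUERY(PBI_DESKTOP, 'EVALUATE ''Data Quality Measures''') AS q;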

The Pentaho BI Platform Workflow Issue

I have been working with Pentaho for the last few days. I have been able to set up Pentaho Report Designer to generate a sample report by following their documentation. Then I followed this article http://www.robertomarchetto.com/www/how_to_use_pentaho_report_designer_tutorial and managed to export the report to the Pentaho BI Server.
What I don't understand is the Pentaho workflow. What process should I follow, i.e. what's the purpose of exporting the report to the Pentaho BI Server? Why is there a Data Integration tool? Why is there a BI Server when I can export the report from the Designer tool?
Requirement
All I want to do is retrieve the data from the MySQL DB, put it into a data mart, and then generate a report from the data mart. (According to what I have read, creating a data mart is the efficient way.)
How can I get it done?
Pentaho Data Integration can be used to automate this report generation.
In Report Designer you pass a parameter or set of parameters to generate a single report output.
With Data Integration you can generate reports for different sets of parameters. For example, if reports are generated on a daily basis, we can automate them for the whole month, so that there is no need to generate reports daily and manually.
And using the Pentaho Business Intelligence Server we can schedule all of these operations.
To generate data/tables (fact tables/dimension tables) in the MySQL DB from different sources like files or other databases - the Data Integration tool comes into the picture.
To create a schema on top of the fact tables - Mondrian.
To handle users/roles on top of the created cubes - Metadata Editor.
To create simple reports on top of small tables - Report Designer.
For sequential execution (at a go) of DI jobs/transformations, reports, and JavaScript - Design Studio.
thanks to user surya.thanuri # forums.pentaho.com
The Data Integration tool is mostly for ETL; it's a separate tool and you can ignore it unless you are doing complex analysis of data from multiple dissimilar data sources. You don't need to 'export' reports to the Pentaho server; you can write them directly to a directory and then refresh the repository from inside the Pentaho web application. Exporting them is just one workflow technique.
You're going to find that there are about a dozen ways to do any one thing with Pentaho. For instance, I use CDA data sources with my reports instead of placing the SQL code inside the report. Alternatively you can link up to a Data Integration server to execute Data Integration scripts and view a result set.
Just to answer your data mart question: in general, a data mart should probably be supported by either the Data Integration tool (depending on your situation, I don't exactly recommend this) or database functions/replication streams (recommended).
Just to hazard a guess, it sounds like someone tossed you a project saying: We need a BI system, here's the database where the data is stored, here are the reports we're already getting. X looked at Pentaho and liked it. You should use that.
The first thing you need to do is understand the shape of the data: volume, tables, interrelations. Figure out what the real questions they want to answer are. Determine whether they need real-time reporting, etc. Just getting the data mart together, if you even need one, can take quite a while. I think you may have jumped the gun on Pentaho itself.
thanks to user flamierd # forums.pentaho.com