I am designing a system that should calculate several KPI indicators. There is a relatively simple transactional database; data from it are transferred to a SQL Server data warehouse (1 fact table, 5 dimension tables). The fact table contains data about call-center phone sessions.
Around 1,000 new fact rows per second are added to the data warehouse.
I need to build KPI indicators (e.g. the average call length per call-center operator over the last 2 hours), and the time from a data update (a new fact row) to the corresponding KPI update should be less than 2 seconds.
Is it possible to build such a fast solution using SQL Server Analysis Services?
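For illustration, the KPI in that example boils down to a query like the one below against the fact table (table and column names are placeholders, not my actual schema); the challenge is refreshing its result within 2 seconds of new rows arriving.

-- average call length per operator over the last 2 hours (placeholder names)
SELECT OperatorKey,
       AVG(CallDurationSeconds) AS AvgCallLengthLast2Hours
FROM dbo.FactCallSession
WHERE CallStartTime >= DATEADD(HOUR, -2, GETDATE())
GROUP BY OperatorKey;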
Consider the scenario below.
I have Table A, Table B and Table C.
The records in these tables need to be updated, according to a calculation formula in their respective columns, on a regular time interval (for example, once every 24 hours).
So I wrote the update queries in a stored procedure and scheduled it as a job.
When I execute the procedure, it takes 15 to 20 minutes to complete, and SQL Server Management Studio hangs because of the 5,000 records in each table.
So I split the update process into 3 separate procedures and measured the execution time at about 1 minute per procedure, based on the record count.
Even after optimizing the procedure:
What kind of transaction should be used when updating the data, in order to avoid exceptions?
Is there any other way to keep the server from running into the timeout issue?
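For what it is worth, one pattern I have been considering to keep each transaction short is updating in small batches, each committed separately (sketched below with placeholder table, column and formula names, not my real schema):

DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION;
    -- placeholder: update the next 500 not-yet-calculated rows
    UPDATE TOP (500) dbo.TableA
    SET    CalculatedColumn = BaseColumn * 1.05
    WHERE  CalculatedColumn IS NULL;
    SET @rows = @@ROWCOUNT;
    COMMIT TRANSACTION;
END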
My DB Plan Information:
I am developing a rental systems product.
Room information is in table 1.
Rented customer information is in table 2.
Each rented room and its tenant's invoices are in table 3.
Each rented room and its tenant's receipts are in table 4.
Step 1: Calculate the total due amount for every tenant, the dues in days and the dues in months for each invoice, in a while loop, and update the table records (a simplified sketch of this calculation follows below).
Step 2: Calculate the total late fee for each invoice on a regular basis.
(Note: the late fee varies for the same invoice because it is calculated on a per-day basis.)
So I am storing the results in table 5, a common table for all leases, with their respective calculated values, on a daily basis.
For that I have created a common stored procedure that calculates all the data and, via a user-defined function, updates table 5.
All my reports and grids fetch their data from table 5.
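To make Step 1 concrete, the calculation is conceptually a set-based update like the sketch below (table and column names are placeholders, the real formula is more involved, and today the same logic runs row by row in the while loop):

-- placeholder sketch of the Step 1 due calculation as one set-based update into table 5
UPDATE s
SET    s.TotalDue    = i.InvoiceAmount - ISNULL(r.ReceiptAmount, 0),
       s.DueInDays   = DATEDIFF(DAY, i.DueDate, GETDATE()),
       s.DueInMonths = DATEDIFF(MONTH, i.DueDate, GETDATE())
FROM   dbo.Table5 AS s
JOIN   dbo.Table3 AS i ON i.InvoiceId = s.InvoiceId       -- invoices
LEFT JOIN (SELECT InvoiceId, SUM(Amount) AS ReceiptAmount
           FROM dbo.Table4 GROUP BY InvoiceId) AS r       -- receipts
       ON r.InvoiceId = i.InvoiceId;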
Problem statement:
Even though I have optimized the procedure, executing it still takes approximately 150 seconds.
Question: How can I implement a transaction for this action, and how do I do it in a proper way so that deadlocks are avoided and other users' data in the same table can still be accessed at the same time?
You can follow the link below to change the remote query timeout setting for your SQL Server:
https://learn.microsoft.com/en-us/sql/database-engine/configure-windows/configure-the-remote-query-timeout-server-configuration-option
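That page covers the remote query timeout server configuration option, which can be changed with sp_configure, for example:

EXEC sp_configure 'remote query timeout', 0;   -- value in seconds, 0 disables the timeout
RECONFIGURE;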
I also suggest keeping as much drive space free as you can by compressing your .ldf and .mdf files.
If you are using a cursor, I have these suggestions:
Try changing it to a join (if possible), as in the sketch below.
If you are using a transaction that sits outside your cursor, and doing so will not damage your data, try moving it into the cursor and committing inside the cursor rather than after the whole query has ended.
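A rough sketch of the first suggestion, with placeholder table, column and formula names: the per-row cursor update becomes a single set-based UPDATE with a join.

-- instead of looping a cursor over TableA and updating one row at a time
UPDATE a
SET    a.CalculatedColumn = b.SourceColumn * a.Rate   -- placeholder formula
FROM   dbo.TableA AS a
JOIN   dbo.TableB AS b ON b.TableAId = a.Id;

If the cursor has to stay, committing inside the loop (one small transaction per iteration) keeps locks from accumulating until the very end of the run.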
I am considering Google BigQuery as my data warehouse option. I have data in Google Cloud SQL and Google Cloud Bigtable, with REST APIs exposed on top of them to consume the data in any UI. I am planning to use the same APIs as the source in my ETL job, which will append data into BigQuery.
From this API I can get daily data. As an example: total entities - 10,000; measurement types associated with each entity - 1,000. So per year (a single entry of each measurement per day) that is 365 (days) * 10,000 (entities) * 1,000 (measurements) = 3,650,000,000 rows (around 4 billion).
Right now, I have 2 choices of schema design:
1. Create a single table with one entity id column and 1,000 measurement columns
2. Use a different table per year (shard the tables year-wise) and later use UNION queries to fetch the data (see the query sketch below)
Please let me know which option would be better in terms of cost and scalability. I understand the 2nd option would be more cost effective as it requires fewer table scans.
Are there any better choices available?
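For reference, with the 2nd option the queries would look roughly like this (table and column names are examples only; the second form uses BigQuery Standard SQL wildcard tables):

-- explicit union of year-sharded tables
SELECT entity_id, measurement_001
FROM dataset.measurements_2016
UNION ALL
SELECT entity_id, measurement_001
FROM dataset.measurements_2017;

-- or a single wildcard query over all year-suffixed shards
SELECT entity_id, measurement_001
FROM `dataset.measurements_*`
WHERE _TABLE_SUFFIX BETWEEN '2016' AND '2017';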
We have a very large dimension in our SSAS database. During the incremental run we use ProcessAdd to process this dimension, and that dimension processing takes 95% of the total cube processing time.
The dimension involves a single table. The named query for the dimension in the DSV is -
SELECT ABC, XYZ, DEF, PQR, PLADKey, LEFT(ABC, 3) AS DNL1, LEFT(ABC, 7) AS DNL2,
LEFT(ABC, 9) AS DNL3
FROM dbo.PLAD AS ad
The table has more than 33,000,000 rows, and the count increases daily. Is it possible that ProcessAdd is slow because of the high row count? Does it automatically pick up only the new rows, or do we have to specify filter criteria to identify the new rows (such as adding a WHERE condition to select only data whose key is greater than the last processed key value)?
We are using AMO to generate the XMLA script for processing. If we need to add filters, how do we do that in AMO?
We are working on SQL Server 2008 R2.
Any suggestions that could improve the performance of this dimension processing would be helpful.
If I understood your current state correctly, you ran a ProcessAdd on that dimension but didn't customize the query to read just the new rows? First, it is important to only run ProcessAdd on dimensions which are insert-only (no updates or deletes) in your ETL. If that's your case, then I blogged about ProcessAdd here. See the "ProcessAdd Dimension 2008.xmla" example. It shows how to provide a SQL query that only returns the new rows.
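In other words, the named query you posted returns the whole table; the query bound to the ProcessAdd needs a filter so that only rows added since the last run come back, along these lines (the watermark value is a placeholder that your ETL or AMO code would substitute in):

SELECT ABC, XYZ, DEF, PQR, PLADKey, LEFT(ABC, 3) AS DNL1, LEFT(ABC, 7) AS DNL2,
       LEFT(ABC, 9) AS DNL3
FROM dbo.PLAD
WHERE PLADKey > @LastProcessedKey   -- placeholder: the highest key value already in the dimension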
I have a table A that contains 500,000 rows. The table is related to 5 other tables (B, C, D, E, F), each about 100,000 rows in size.
I'd like to extract, in a fast way, structured information of the form (key A, value A+B+C+D+E+F), where value A+B+C+D+E+F is the column produced by an inner join of these 6 tables (which is a bit slow).
Updates/inserts/deletes occur on these tables every 10 minutes (asynchronously).
I need to present the result of this selection as a binding source to perform a search.
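The selection I want to materialize is essentially the following (simplified sketch; the real join keys and value columns differ):

SELECT a.KeyA,
       CONCAT(a.ValueA, b.ValueB, c.ValueC, d.ValueD, e.ValueE, f.ValueF) AS CombinedValue
FROM dbo.TableA AS a
JOIN dbo.TableB AS b ON b.AKey = a.KeyA
JOIN dbo.TableC AS c ON c.AKey = a.KeyA
JOIN dbo.TableD AS d ON d.AKey = a.KeyA
JOIN dbo.TableE AS e ON e.AKey = a.KeyA
JOIN dbo.TableF AS f ON f.AKey = a.KeyA;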
My IDEA
Materialize this selection in a NoSQL database and perform the search there. My questions are:
Which NoSQL database should I use?
How can I synchronize SQL Server with the NoSQL database, so that the NoSQL data changes any time a SQL Server record changes?
SCENARIO
Desktop application built with C# and .NET Framework 4.0
Database: SQL Server 2012
In a financial system, the transactions of each year are stored in a separate table, so there are Transactions2007, Transactions2008, ..., Transactions2012 tables in the system. They all have the same table design. The data in the tables of previous years never change, but the current year's data is updated daily.
I want to build a cube on the union of the tables of all years. The question is how to prevent SSAS from reprocessing previous years.
When processing the cube, you can set the process option to Process Incremental and then, in the Configuration Dialog, select a query that reads data only from the recent tables. Here is a link for more info.
I handled it by partitioning the cube (by time dimension) and processing only the most recent partition.
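Concretely, each yearly partition gets its own source query, so the closed years are processed once and only the current year's partition is reprocessed (table names follow the naming in the question):

-- source query for a closed year's partition (processed once, then left alone)
SELECT * FROM dbo.Transactions2011;

-- source query for the current year's partition (the only one reprocessed regularly)
SELECT * FROM dbo.Transactions2012;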