SSMS displaying different messages for the same action

I have a few cubes configured in the same way, and when I process them, all of them run fine:
While they run, the status says "Processing in progress" for a minute or two, and then processing finishes.
But I have a new cube with the exact same configuration, and instead of "Processing in progress" I always see "Retrieved xyz rows....", with a growing row count:
Why is this different? Why do I see the row count when I process this cube?

Related

report scheduler system design using database as master

Problem
we have ~50k scheduled financial reports that we periodically deliver to clients via email
each report has its own delivery frequency (date & time, as configured by clients):
weekly
daily
hourly
weekdays only
etc.
Current architecture
we have a table called report_metadata that holds report information
report_id
report_name
report_type
report_details
next_run_time
last_run_time
etc...
every week, all 6 instances of our scheduler service poll the report_metadata table, extract the metadata for all reports due in the following week, and put them into an in-memory timed queue.
Only in the master/leader instance (which is one of the 6 instances):
data in the timed-queue is popped at the appropriate time
processed
a few API calls are made to get a fully-complete and current/up-to-date report
and the report is emailed to clients
the other 5 instances do nothing - they simply exist for redundancy
Proposed architecture
Numbers:
db can handle up to 1000 concurrent connections - which is good enough
total existing report number (~50k) is unlikely to get much larger in the near/distant future
Solution:
instead of polling the report_metadata db every week and storing data in an in-memory timed queue, all 6 instances will poll the report_metadata db every 60 seconds (with a 10 s offset for each instance)
on average the scheduler will attempt to pick up work every 10 seconds
data for any single report whose next_run_time is in the past is extracted, the table row is locked, and the report is processed/delivered to clients by that specific instance
after the report is successfully processed, the table row is unlocked and the report's next_run_time, last_run_time, etc. are updated
In general, the database serves as the master; individual instances of the process can work independently, and the database ensures they do not overlap.
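A minimal sketch of that claim step, assuming PostgreSQL-style syntax (SELECT ... FOR UPDATE SKIP LOCKED also exists in Oracle and MySQL 8+; adjust for your database) and the report_metadata columns listed above:

BEGIN;
-- Claim one due report; concurrent instances skip rows that are already locked.
SELECT report_id, report_type, report_details
FROM report_metadata
WHERE next_run_time <= now()
ORDER BY next_run_time
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ... generate and email the report for the claimed report_id ...
-- Mark it done and schedule the next run (a daily report is shown as an example).
UPDATE report_metadata
SET last_run_time = now(),
    next_run_time = next_run_time + interval '1 day'
WHERE report_id = 42;  -- the report_id claimed above
COMMIT;

An index on next_run_time (the only predicate every instance filters on) keeps the 60-second polling query cheap; report_id should already be the primary key.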
It would help if you could let me know:
whether the proposed architecture is a good/correct solution
which table columns can/should be indexed
any other considerations
I have worked on a different kind of scheduler for a program that reported analyses at specific moments of the month/week. What I did was combine the reports into so-called business-cycle-based time moments: "start of a new week", "start of the month", "start/end of a day/week/month/quarter/year". So I standardised the moments of sending the reports and added the IDs to a table that carries the details of each report. You then add things to a cycle or remove them when needed, which you could do by adding a tag such as EOD (end of day), EOM (end of month), SOW (start of week), etc.
So you could index the moments when the clients want to receive the reports and build on that. I hope this comment helps you with your challenge.
Having all 6 instances simply query that metadata table to check which report to process next, as you are suggesting, seems fine.
It seems odd, though, to have a staggered approach where each server checks once every 60 seconds, offset by 10 seconds. You have 6 servers now, but that may change. Also, I don't understand the "locking" you are suggesting: why not simply set a flag on the row such as [State] = "processing", so the next scheduler knows to skip that row and move on to the next available one? Once a run is processed, you can simply update a [Date_last_processed] column, or maybe something like [last_cycle_complete] = 'YES'.
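A hedged sketch of that flag idea, assuming a state column is added to report_metadata (all names here are illustrative); the single UPDATE acts as an atomic claim, so two instances cannot grab the same row:

-- Claim a due report by flipping its state in one atomic statement.
UPDATE report_metadata
SET state = 'processing'
WHERE report_id = 42        -- a row found to be due by the polling query
  AND state = 'idle';       -- succeeds only if no other instance claimed it first
-- If 1 row was affected, this instance owns the report; after delivery:
UPDATE report_metadata
SET state = 'idle',
    last_run_time = now(),
    last_cycle_complete = 'YES'
WHERE report_id = 42;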
Alternatively, you could have one server process go through the table and, for each available row, send it off to one of the instances in a round-robin fashion (or keep track of who is busy and who isn't).

Dashboard shows 79 Succeeded jobs but where is the list?

When I select the "Succeeded" menu in the dashboard, only a single job is shown, yet there is a number beside the text "Succeeded" indicating the number of jobs that have executed without error. How do I see those?
Hangfire automatically clears succeeded jobs after a certain amount of time (1 day by default).
The number beside "Succeeded" is the total number of successful jobs since the beginning.
See this answer from the Hangfire forum

Combine the Output of 3 Transformation in Pentaho

I'm executing 3 transformations in parallel. The output of the three transformations contains the same column names.
I've added the output of all transformations to a common dummy step in the job, added a Wait for SQL step to wait until all 3 transformations have completed execution, and added a Unique rows step in the next transformation to remove duplicate records.
Everything works properly up to the Wait for SQL step, but when the next transformation gets rows from the result and performs the Unique rows step, I still get duplicate records.
Does anyone have a solution for this issue?
You have to sort your resulting stream after the dummy step before removing the duplicate rows. The sort will also make sure that all 3 streams are completed before sorting.
I didn't know you could use the dummy step to combine stream results. I always used the Append streams step for that.
Several points:
You cannot simply combine the outputs of multiple transformations at the job level. You will need another transformation to read the data using the Get rows from result step; jobs don't know about data streams, they only know about tasks (job entries) and exit status.
Be careful with "Launch next entries in parallel" at the job level. Let's say you have 2 transformations, trans1 and trans2, launched in parallel, followed by a dummy step. The dummy will be called TWICE, once after trans1 finishes and again when trans2 finishes. A job hop is not a data stream, it's a workflow. If you want to run transformations in parallel and later go back to a single workflow, you need a subjob that calls the transformations and doesn't have a Success job entry. That way, the subjob only finishes after the 2nd transformation finishes, and only then does it go to the dummy step in the parent.
Why do you need those transformations running inside a job? If they have the same column structure, why don't you call them as sub transformations inside a transformation, and not a job? Steps in a transformation are always launched in parallel, so if you're parallelizing things for performance, a transformation is the way to do it, not a job. A job is meant to run multiple tasks in sequential order, one after the other, with workflow control depending on the result of the previous step.
If you want the output from all three of them in a single file, you could run the same 3 instances in a single transformation with the "Append to target" option ticked in your output step.

SSRS Data-Driven Subscription [based on static Subscription table] Not Picking Up Changes Made to Subscription Table

I have a .RDL report which I designed in BIDS and have deployed to my report server. The report asks for three parameters before viewing report: Year, Month and Customer ID. The report works great and does exactly what it is supposed to.
While I used to run each report individually because there were 2-3 customers, there are now 30+ customers who receive the report, so I wanted to switch to a more automated fulfillment method to get the reports generated. After doing some research, it appears that using Report Manager to create a "Data Driven Subscription" (DDS) with the "Windows File Share" option gives me the capabilities I need.
As part of creating the DDS, I created a table called [Subscription], which contains one row for each customer receiving the report and has the following columns:
Year
Month
CustomerID
FileName
FileLocation
Overwrite
Format
...so through using the DDS Wizard in Report Manager, I was able to successfully set up a Data Driven Subscription (which is linked to various columns in the [Subscription] table) which creates a new report for each customer in the [Subscription] table, saves [and overwrites, if necessary] it in a location of my choosing as a PDF (specified in [Subscription].[FileLocation], or the FileLocation column of my table for each row), and runs every minute (I plan on changing frequency to once a week, eventually).
This works flawlessly, giving me a new set of 30 reports in the directory of my choosing, with each report having a name I assigned in the FileName column of my table. Exactly what I was looking for.
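For illustration, a hedged sketch of what such a [Subscription] table and the DDS dataset query might look like (column types here are assumptions based on the list above):

-- Hypothetical shape of the static subscription table.
CREATE TABLE dbo.Subscription (
    [Year]       int          NOT NULL,
    [Month]      int          NOT NULL,
    CustomerID   varchar(20)  NOT NULL,
    [FileName]   varchar(100) NOT NULL,
    FileLocation varchar(255) NOT NULL,
    Overwrite    bit          NOT NULL,
    [Format]     varchar(10)  NOT NULL  -- e.g. 'PDF'
);

-- The query the DDS wizard uses as its dataset: one row per report to generate.
SELECT [Year], [Month], CustomerID, [FileName], FileLocation, Overwrite, [Format]
FROM dbo.Subscription;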
HERE'S THE PROBLEM: When I update the FileLocation or FileName (or anything, really) in the [Subscription] table, it doesn't pick up the changes right away. Sometimes it doesn't even pick them up at all (for example, I updated the [ReportName] column for one customer from Report_711622 to SpecialReport_711622, so that the output file for that customer should be named SpecialReport_711622 while all of the other reports should be called Report_XXXXX [no Special prefix]). But the file name of the report for customer 711622 remains the same!
It's almost like the job only sees what it needs to do once a day and does not go back and reference the [Subscription] table until after I leave for the night; when I come back in the morning, it has picked up the change.
Since I am about to scale this process out to a large customer-base using a different report, I need to be able to make edits to the [Subscription] table and have them get picked up by the Data Driven Subscription immediately (and if not immediately, at least a fixed interval of time that I can adjust, so that I can know 100% when the change will get picked up).
Does anyone know what's causing my lag? How do I change it so that updates to the Subscription table get picked up regularly? I'm also having issues with creating new DDSs on other reports (following the exact process outlined above): I've created the subscriptions to run every minute, and it says they are running and the number of outputs matches the number of customers with 0 errors, but there are no files in the drive I specified (or anywhere else I've looked, for that matter).
Any help would be greatly appreciated!
I think the answer lies in the mechanism SSRS uses. There are a few places "lag" can occur.
The subscription is in fact an SQL Agent job which creates a record in the Event table. This table is a queue that SSRS checks to do scheduled tasks.
There is a small amount of time between the moment the subscription creates the Event record and the moment the report server reads it and starts creating the dataset for your DDS. The creation of the DDS dataset takes some time, too. During this time, the subscription will be in the Pending state. If you change anything in the data during this time, the subscription will still use the old data as report parameters. So obviously you will not notice your change until the next scheduled run.
Which brings me to the following: if a subscription is still being run and the next schedule kicks in (chances are, because yours runs every minute), the engine will not execute it, but wait for the next subscription schedule, and so on. So that's another possibility of lag - and cause of missing reports for a certain schedule minute. The subscription processes reports sequentially, one row from your DDS recordset at a time. Again, this takes some time. You can also see that in the subscription window when it says: # of # processed.
I suggest you look at the Event table in the ReportServer database during an execution. Also the ExecutionLog views (there are 3: ExecutionLog, ExecutionLog2, and ExecutionLog3) may be interesting. A scheduled run shows up as a RequestType = 1 and generates one record for each report. You can see the exact timing and parameters of each report that is run in the subscription. You may be able to extract the data you need to resolve your other issues.
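As a hedged illustration of that inspection (object names as they commonly appear in a ReportServer catalog; the available columns can differ between SSRS versions):

-- Subscription events queued for the report server's scheduler.
SELECT EventType, EventData, TimeEntered
FROM ReportServer.dbo.[Event]
ORDER BY TimeEntered DESC;

-- Recent executions; subscription runs are identifiable by their RequestType.
SELECT TOP (50) ItemPath, RequestType, [Status], [Parameters], TimeStart, TimeEnd
FROM ReportServer.dbo.ExecutionLog3
ORDER BY TimeStart DESC;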
EDIT: Here is a more elaborate guide to DDS data and events
http://blogs.msdn.com/b/deanka/archive/2009/01/13/diagnosing-and-troubleshooting-subscriptions.aspx
http://blogs.msdn.com/b/deanka/archive/2010/02/16/troubleshooting-subscriptions-part-ii-using-the-report-services-trace-log-file.aspx
Could this "Double-Hop" problem be the source of my issues? I'm so stuck on this one!
The Double-Hop Problem - MSDN Knowledgecast

Determining query's progress (Oracle PL/SQL)

I am a developer on a web app that uses an Oracle database. However, often the UI will trigger database operations that take a while to process. As a result, the client would like a progress bar when these situations occur.
I recently discovered that I can query V$SESSION_LONGOPS from a second connection, and this is great, but it only works on operations that take longer than 6 seconds. This means that I can't update the progress bar in the UI until 6 seconds has passed.
I've done research on wait times in V$SESSION, but as far as I've seen, that doesn't cover the progress of the query itself.
Is there a way to get the progress of the currently running query of a session? Or should I just hide the progress bar until 6 seconds has passed?
Are these operations PL/SQL calls or just long-running SQL?
With PL/SQL operations we can write messages with SET_SESSION_LONGOPS() in the DBMS_APPLICATION_INFO package. We can monitor these messages in V$SESSION_LONGOPS. Find out more.
For this to work you need to be able to quantify the operation in units of work. These must be iterations of something concrete, and numeric not time. So if the operation is insert 10000 rows you could split that up into 10 batches. The totalwork parameter is the number of batches (i.e. 10) and you call SET_SESSION_LONGOPS() after every 1000 rows to increment the sofar parameter. This will allow you to render a thermometer of ten blocks.
These messages are session-based but there's no automatic way of distinguishing the current message from previous messages from the same session & SID. However if you assign a UID to the context parameter you can then use that value to filter the view.
This won't work for a single long running query, because there's no way for us to divide it into chunks.
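A minimal PL/SQL sketch of that pattern, assuming the 10,000-row load split into 10 batches described above (operation and unit names are made up for illustration):

DECLARE
  l_rindex    BINARY_INTEGER := dbms_application_info.set_session_longops_nohint;
  l_slno      BINARY_INTEGER;
  l_totalwork NUMBER := 10;  -- 10 batches of 1,000 rows each
BEGIN
  FOR l_batch IN 1 .. l_totalwork LOOP
    -- ... insert the next 1,000 rows here ...
    dbms_application_info.set_session_longops(
      rindex    => l_rindex,
      slno      => l_slno,
      op_name   => 'Load customer rows',  -- appears as OPNAME in V$SESSION_LONGOPS
      sofar     => l_batch,
      totalwork => l_totalwork,
      units     => 'batches');
  END LOOP;
END;
/

A second session can then poll V$SESSION_LONGOPS (filtering on OPNAME, or on a UID passed via the context parameter as mentioned above) and compute SOFAR/TOTALWORK to drive the thermometer.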
I found this very useful:
dbms_application_info.set_module('MY Program', 'Kicking off ...');
..
dbms_application_info.set_action('Extracting data ...');
..
dbms_application_info.set_action('Transforming data ...');
..
You can monitor the progress using:
select module, action from v$session where sid = :yoursessionid;
I've done quite a lot of web development with Oracle over the years and found that most users prefer an indeterminate progress bar to a determinate bar that is inaccurate (à la pretty much any of Microsoft's progress bars, which annoy me no end), and unfortunately there is no infallible way of accurately determining query progress.
Whilst your research into the long ops capability is admirable and would definitely help to make the progress of the database query more reliable, it can't take into account the myriad of other variables that may/will affect the web operation's transactional progress (network load, database load, application server load, client-side data parsing, the user clicking on a submit button 1,000 times, etc and so on).
I'd stick to the indeterminate progress method using Javascript callbacks. It's much easier to implement and it will manage your user's expectations as appropriate.
Using V$SESSION_LONGOPS requires setting TIMED_STATISTICS=true or SQL_TRACE=true. Your database schema must be granted the ALTER SESSION system privilege to do so.
I once tried using V$SESSION_LONGOPS with a complex, long-running query, but it turned out that V$SESSION_LONGOPS only shows the progress of parts of the query, such as full table scans, join operations, and the like.
See also: http://www.dba-oracle.com/t_v_dollar_session_longops.htm
What you can do is simply show the user that the query is still running. I implemented a <DIV> nested in a <TD> that gets longer with every status request sent by the browser. Status requests are initiated by window.setTimeout (every 3 seconds) and are AJAX calls to a server-side procedure. The status report returned by the server-side procedure simply says "we are still running". The progress bar's width (i.e. the <DIV>'s width) increments by 5% of the <TD>'s width every time and is reset to 5% after reaching 100%.
For long-running queries you might track the time they took in a separate table, possibly with individual entries for varying WHERE clauses. You could use this to display the average time, plus the time elapsed so far, in the client-side dialog.
If you have a long running PL/SQL procedure or the like on the server side doing several steps, try this:
create a table for status messages
use a unique key for each process the user starts. Suggestion: the client side's JavaScript date in milliseconds + the session ID.
in case the long running procedure is to be started by a link in a browser window, create a job using DBMS_JOB.SUBMIT to run the procedure instead of running the procedure directly
write a short procedure that updates the status table, using PRAGMA AUTONOMOUS_TRANSACTION (see the sketch after this list). This pragma allows you to commit updates to the status table without committing your main procedure's updates. Each major step of your main procedure should have an entry of its own in this status table.
write a procedure to query the status table to be called by the browser
write a procedure that is called via AJAX if the user clicks "Cancel" or closes the window
write a procedure that is called by the main procedure after completion of each step: it queries the status table and raises an exception with an error number in the -20000 range (user-defined errors) if the cancel flag was set or the browser has not queried the status for, say, 60 seconds. In the main procedure's exception handler, look for this error, do a rollback, and update the status table.
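As a hedged illustration of the status-table pieces above (all names are invented), the autonomous-transaction updater mentioned in the list could look like this:

-- Hypothetical status table: one row per user-started process.
CREATE TABLE process_status (
  process_key  VARCHAR2(64) PRIMARY KEY,  -- e.g. client's JS milliseconds + session ID
  current_step VARCHAR2(200),
  cancel_flag  CHAR(1) DEFAULT 'N',
  last_polled  DATE DEFAULT SYSDATE
);

CREATE OR REPLACE PROCEDURE set_process_status (
  p_key  IN process_status.process_key%TYPE,
  p_step IN process_status.current_step%TYPE
) AS
  PRAGMA AUTONOMOUS_TRANSACTION;  -- commits independently of the main procedure's work
BEGIN
  UPDATE process_status
  SET current_step = p_step
  WHERE process_key = p_key;
  IF SQL%ROWCOUNT = 0 THEN
    INSERT INTO process_status (process_key, current_step) VALUES (p_key, p_step);
  END IF;
  COMMIT;  -- commits only the status row, not the caller's pending changes
END set_process_status;
/

The browser-facing status procedure and the cancel handler would read and update the same row; the check after each major step could then call, for example, RAISE_APPLICATION_ERROR(-20001, 'Cancelled by user') when cancel_flag = 'Y'.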