xperf - CPU cycles consumed by a process between a given pair of ETW events

Scenario:
1. Xperf is started.
2. The app is started. The app raises two ETW events (E1, E2).
3. While the app runs, the E1 and E2 events are raised; T1 and T2 are the timestamps corresponding to E1 and E2.
4. Xperf is stopped and an ETL file is generated.
5. The T1 and T2 timestamps are obtained from the ETL file.
Query:
How can I get the CPU cycles consumed by the app between timestamps T1 and T2 using xperf?

Open the ETL file in WPA.exe and drag and drop the CPU Usage (Sampling) and Generic Events graphs into the analysis pane.
Select the time range between event 1 and event 2 and click "Zoom". Now go to the CPU sampling graph and filter it to include only your application. You will then see the sample count and the percentage of CPU your app used during the interval between the two events.

Related

Splunk query using time an event occurs in one index and using it as a starting point to filter events in another index

What's the most efficient way to perform the following search?
Event occurs on index A at X time
Take X time and use it as a start point in index B
Search all occurrences of a field within index B, with additional filters, within the 5 minutes after the initial event time from index A
Example using Windows logs: after every successful login via event ID 4624 (index="security") for a particular user on a host, search all Sysmon event ID 1 (index="sysmon") process creation events on that specific host that occurred in a 5 minute window after the login event. My vision is to examine user logins on a particular host and correlate subsequent process creation events over a short period of time.
I've been trying to play with join, stats min(_time), and eval starttimeu, but haven't had any success. Any help/pointers would be greatly appreciated!
Have you tried map? The map command runs a search for each result of another search. For example:
index=security sourcetype=wineventlog EventCode=4624
```Set the latest time for the map to event time + 5 minutes (300 seconds)```
| eval latest=_time+300
| map search="search index=sysmon host=$host$ earliest=$_time$ latest=$latest$"
Tokens wrapped in $...$ (such as $host$ and $_time$) are replaced with field values from the main search. Note that map runs one search per result and is capped by its maxsearches argument (default 10), so raise it if the base search can return more login events.

Report scheduler system design using the database as master

Problem
We have ~50k scheduled financial reports that we periodically deliver to clients via email.
Each report has its own delivery frequency (date & time, as configured by clients):
weekly
daily
hourly
weekdays only
etc.
Current architecture
We have a table called report_metadata that holds report information:
report_id
report_name
report_type
report_details
next_run_time
last_run_time
etc...
Every week, all 6 instances of our scheduler service poll the report_metadata database, extract the metadata for all reports due in the following week, and put them in an in-memory timed queue.
Only in the master/leader instance (which is one of the 6 instances):
data in the timed-queue is popped at the appropriate time
processed
a few API calls are made to get a fully-complete and current/up-to-date report
and the report is emailed to clients
the other 5 instances do nothing - they simply exist for redundancy
Proposed architecture
Numbers:
db can handle up to 1000 concurrent connections - which is good enough
total existing report number (~50k) is unlikely to get much larger in the near/distant future
Solution:
instead of polling the report_metadata db every week and storing data in an in-memory timed queue, all 6 instances will poll the report_metadata db every 60 seconds (each instance offset by 10 seconds)
so, on average, one of the instances attempts to pick up work every 10 seconds
the data for any single report whose next_run_time is in the past is extracted, the table row is locked, and the report is processed/delivered to clients by that specific instance
after the report is successfully processed, the table row is unlocked and the report's next_run_time, last_run_time, etc. are updated
In general, the database serves as the master, individual instances of the process can work independently and the database ensures they do not overlap.
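For concreteness, the claim-and-lock step could look like this (a minimal sketch, assuming a PostgreSQL-style SELECT ... FOR UPDATE SKIP LOCKED; column names are taken from the table above, and the 1-day interval stands in for the report's configured frequency):

BEGIN;

SELECT report_id, report_name, report_details
  FROM report_metadata
 WHERE next_run_time <= now()
 ORDER BY next_run_time
 LIMIT 1
   FOR UPDATE SKIP LOCKED;   -- concurrent pollers skip rows already claimed

-- ... generate and email the report here ...

UPDATE report_metadata
   SET last_run_time = now(),
       next_run_time = now() + interval '1 day'  -- in practice, computed from the report's frequency
 WHERE report_id = 123;                          -- the id returned by the SELECT above

COMMIT;

With this pattern, an index on next_run_time keeps the per-minute polling query cheap.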
It would help if you could let me know:
whether the proposed architecture is a good/correct solution
which table columns can/should be indexed
any other considerations
I have worked on a different kind of scheduler for a program that reported analyses at specific moments of the month/week. What I did was group the reports into so-called business-cycle-based time moments; these moments are the "start of a new week", "start of the month", and the "start/end of a day/week/month/quarter/year". So I standardised the moments at which reports are sent and added the report IDs to a table carrying the details of each report. You can then add things to a cycle, or remove them when needed, by tagging each report with a label such as EOD (end of day), EOM (end of month), SOW (start of week), etc.
So you could index the moments at which clients want to receive the reports and build on that. I hope this comment helps you with your challenge.
It seems good to simply have all 6 instances query that metadata table to check which report to process next, as you are suggesting.
It seems odd, though, to have a staggered approach with a check once every 60 seconds offset by 10 seconds per server. You have 6 servers now, but that may change. Also, I don't understand the "locking" you are suggesting: why not simply set a flag on the row, such as [State] = "processing", so the next scheduler knows to skip that row and move on to the next available one? Once a run is processed, you can simply update a [Date_last_processed] column, or maybe something like [last_cycle_complete] = 'YES'.
Alternatively, you could have one server process that goes through the table and, for each available row, sends it off to one of the instances in round-robin fashion (or keeps track of who is busy and who isn't).

Simultaneous run of simulation and optimization in AnyLogic

In AnyLogic I created the production chain of a furniture company for five product types. The chain was built from Delay and Service blocks together with ResourcePool blocks to simulate the production time of each product and the availability of the workers at the machining stations.
The goal was to match the load profile to the energy production profile of a photovoltaic system. For this purpose, the production times should be chosen individually. The production profile is loaded into the simulation from an Excel table. Every day the company produces 20 pieces of furniture of different types.
The idea for adjusting the production time and the kind of furniture (every furniture type has a different production time) was to use parameters to determine both the start point of production and the selection of the product type. I need two parameters for each product.
In an optimization experiment in AnyLogic, the simulation runs through one day. The objective is to minimize the amount of energy drawn from the grid. After 500 iterations and some adjustments of the value ranges, I get adapted parameter values as the result. The resulting parameter values from the optimization can be transferred to the production chain simulation.
My question is:
Is it possible to transfer the resulting parameters to the production chain program automatically, so that I do not always need to run both simulations separately?
I want to simulate a whole year, to get specific results for the different production profiles over the year, without starting the optimization program and the simulation individually for every day.

Background Threads in Horizontally Scaled Application

The current situation is that I have an application that scales horizontally with one SQL database. Periodically, a background process is run, but I only want one invocation of this background process running at a time. I have tried to accomplish this using a database row and locking, but I am stuck. The requirement is that only one batch job should complete successfully per day.
Currently I have a table called lock which has three columns: timestamp, lock_id, and status. Status is an enum with three values: 0 = not running, 1 = running, 2 = completed.
The issue is: if a batch job fails and status is equal to 0, how can I make sure that only one background process will retry? How do I guarantee that only one background process is running in the retry scenario?
In an ideal world, I would like to do a SELECT statement that checks the STATUS in the locking table; if status = 0, meaning not running, then start the background job and change status to 1 = running. However, if all horizontally scaled processes do this at the same time, is it guaranteed that only one is executed?
Thanks!
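One way to make the check-then-set described above atomic is to collapse it into a single conditional UPDATE (a minimal sketch; the table and columns are the ones from the question, quoted because lock and timestamp are reserved words in many databases, and a single row with lock_id = 1 is assumed):

-- Atomically claim the lock: concurrent UPDATEs serialize on the row,
-- and only one of them still matches status = 0.
UPDATE "lock"
   SET status = 1, "timestamp" = CURRENT_TIMESTAMP
 WHERE lock_id = 1
   AND status = 0;

-- 1 affected row  -> this process owns the job and runs the batch
-- 0 affected rows -> another process claimed it first; back off

Checking the affected-row count from the database driver replaces the separate SELECT, so there is no window between the check and the claim.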

Determining query's progress (Oracle PL/SQL)

I am a developer on a web app that uses an Oracle database. However, often the UI will trigger database operations that take a while to process. As a result, the client would like a progress bar when these situations occur.
I recently discovered that I can query V$SESSION_LONGOPS from a second connection, and this is great, but it only works on operations that take longer than 6 seconds. This means that I can't update the progress bar in the UI until 6 seconds have passed.
I've done research on wait times in V$SESSION, but as far as I've seen, that doesn't include the waiting for the query.
Is there a way to get the progress of the currently running query of a session? Or should I just hide the progress bar until 6 seconds have passed?
Are these operations PL/SQL calls or just long-running SQL?
With PL/SQL operations we can write messages with SET_SESSION_LONGOPS() in the DBMS_APPLICATION_INFO package and monitor these messages in V$SESSION_LONGOPS.
For this to work you need to be able to quantify the operation in units of work. These must be iterations of something concrete, and numeric, not time. So if the operation inserts 10,000 rows, you could split that up into 10 batches. The totalwork parameter is the number of batches (i.e. 10) and you call SET_SESSION_LONGOPS() after every 1,000 rows to increment the sofar parameter. This will allow you to render a thermometer of ten blocks.
These messages are session-based, but there's no automatic way of distinguishing the current message from previous messages from the same session and SID. However, if you assign a UID to the context parameter, you can then use that value to filter the view.
This won't work for a single long-running query, because there's no way for us to divide it into chunks.
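A minimal sketch of that batching pattern (the 10 batches of 1,000 rows are the example from above; the target table t is a placeholder):

DECLARE
  l_rindex BINARY_INTEGER := dbms_application_info.set_session_longops_nohint;
  l_slno   BINARY_INTEGER;
BEGIN
  FOR i IN 1 .. 10 LOOP
    INSERT INTO t (n)                                   -- t is a placeholder table
      SELECT level FROM dual CONNECT BY level <= 1000;  -- one batch of 1,000 rows

    dbms_application_info.set_session_longops(
      rindex    => l_rindex,   -- reusing rindex updates the same longops row
      slno      => l_slno,
      op_name   => 'Batch insert',
      context   => 42,         -- the UID used to filter V$SESSION_LONGOPS
      sofar     => i,
      totalwork => 10,
      units     => 'batches');
  END LOOP;
  COMMIT;
END;
/

The second connection can then poll:

select opname, sofar, totalwork, units from v$session_longops where context = 42;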
I found this very useful (note these calls live in DBMS_APPLICATION_INFO, not DBMS_SESSION):
dbms_application_info.set_module('My Program', 'Kicking off ...');
..
dbms_application_info.set_action('Extracting data ...');
..
dbms_application_info.set_action('Transforming data ...');
..
You can monitor the progress using:
select module, action from v$session where sid = :yoursessionid
I've done quite a lot of web development with Oracle over the years and found that most users prefer an indeterminate progress bar to a determinate bar that is inaccurate (à la pretty much any of Microsoft's progress bars, which annoy me no end), and unfortunately there is no infallible way of accurately determining query progress.
Whilst your research into the longops capability is admirable and would definitely help make the progress of the database query more reliable, it can't take into account the myriad other variables that may affect the web operation's overall progress (network load, database load, application server load, client-side data parsing, the user clicking a submit button 1,000 times, and so on).
I'd stick to the indeterminate progress method using JavaScript callbacks. It's much easier to implement and it will manage your users' expectations appropriately.
Using V$SESSION_LONGOPS requires TIMED_STATISTICS=true or SQL_TRACE=true to be set. Your database schema must be granted the ALTER SESSION system privilege to do so.
I once tried using V$SESSION_LONGOPS with a complex, long-running query. But it turned out that V$SESSION_LONGOPS may only show the progress of parts of the query, such as full table scans, join operations, and the like.
See also: http://www.dba-oracle.com/t_v_dollar_session_longops.htm
What you can do is simply show the user that the query is still running. I implemented a <DIV> nested inside a <TD> that gets longer with every status request sent by the browser. Status requests are initiated by window.setTimeout (every 3 seconds) and are AJAX calls to a server-side procedure. The status report returned by the server-side procedure simply says "we are still running". The progress bar's width (i.e. the <DIV>'s width) increases by 5% of the <TD>'s width each time and is reset to 5% after showing 100%.
For long-running queries you might track the time they took in a separate table, possibly with individual entries for varying WHERE clauses. You could use this to display the average time, plus the time that has elapsed so far, in the client-side dialog.
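A sketch of such a timing table (all names are illustrative):

-- one row per completed run of a tracked query
CREATE TABLE query_timings (
  query_tag  VARCHAR2(100),  -- identifies the query / WHERE-clause variant
  elapsed_s  NUMBER,         -- seconds the run took
  run_at     DATE
);

-- average historical runtime, shown next to the elapsed time in the dialog
SELECT AVG(elapsed_s) AS avg_elapsed_s
  FROM query_timings
 WHERE query_tag = 'monthly_report';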
If you have a long-running PL/SQL procedure or the like on the server side doing several steps, try this:
create a table for status messages
use a unique key for any process the user starts. Suggestion: the client side's JavaScript date in milliseconds + the session ID.
in case the long-running procedure is to be started by a link in a browser window, create a job using DBMS_JOB.SUBMIT to run the procedure instead of running the procedure directly
write a short procedure that updates the status table, using PRAGMA AUTONOMOUS_TRANSACTION. This pragma allows you to commit updates to the status table without committing your main procedure's updates. Each major step of your main procedure should have an entry of its own in this status table (see the sketch after this list).
write a procedure to query the status table to be called by the browser
write a procedure that is called by an AJAX call if the user clicks "Cancel" or closes the window
write a procedure that is called by the main procedure after completion of each step: it queries the status table and raises an exception with an error number in the -20000 range (a user-defined error) if the cancel flag was set or the browser has not queried the status for, say, 60 seconds. In the main procedure's exception handler, look for this error, do a rollback, and update the status table.
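A sketch of the status table and the autonomous writer from step 4 (all names are illustrative, not from the original):

CREATE TABLE job_status (
  job_id     VARCHAR2(64) PRIMARY KEY,  -- e.g. JS date in ms + session ID
  step       VARCHAR2(200),
  cancelled  CHAR(1) DEFAULT 'N',
  updated_at DATE
);

CREATE OR REPLACE PROCEDURE write_status (
  p_job_id IN VARCHAR2,
  p_step   IN VARCHAR2
) AS
  PRAGMA AUTONOMOUS_TRANSACTION;  -- commits independently of the main procedure
BEGIN
  MERGE INTO job_status s
  USING (SELECT p_job_id AS job_id FROM dual) x
     ON (s.job_id = x.job_id)
  WHEN MATCHED THEN
    UPDATE SET s.step = p_step, s.updated_at = SYSDATE
  WHEN NOT MATCHED THEN
    INSERT (job_id, step, updated_at) VALUES (x.job_id, p_step, SYSDATE);
  COMMIT;  -- immediately visible to the status-polling procedure
END write_status;
/

Because of the autonomous transaction, each call commits its status row without committing (or rolling back) any of the main procedure's pending work.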