How can I collect SQL commands of an application on server level?


What technique can I use to capture, monitor, log, or save the native SQL commands issued by an application that we developed? We use an Oracle database.
I have already tried the Tools > Monitor Sessions function in SQL Developer, but it does not show the SQL statements of our apps. The Real Time SQL Monitor shows only some of the required commands, along with a lot of irrelevant entries.
In practice, what I want is to:
- "Switch on" the trace function (e.g. in SQL Developer or SQL*Plus)
- Launch the application and exercise some functionality with real data (e.g. the slow queries)
- As soon as I think I have enough measurements, "switch off" the trace function
- Start analyzing and tuning the SQL commands (e.g. with SQL Developer's Explain Plan)

1) You can always use AWR reports for a specified time period to see which queries were running in the database during that period.
Run the AWR report from SQL*Plus using:
@$ORACLE_HOME/rdbms/admin/awrrpt.sql
By default AWR captures only the top SQL statements, but you can increase the number of statements it captures.
2) A database-level trace can capture all SQL statements that run, with execution plans and other statistics. You can start and stop the trace manually. Once the trace is stopped, you can use tkprof to turn the trace files into readable reports. Tracing always adds some performance and disk space overhead, but in my experience it is not massive.
Abhi

Related

Finding the slow stored procedures from a Trace file

A client sent me a trace file from their production environment.
I want to know which stored procedures take the longest.
The events recorded in the trace file include RPC:Starting and RPC:Completed.
I noticed that the trace columns include both StartTime/EndTime and Duration.
Which one should I use for my purpose?
And to find out how long an SP took, should I take the difference between the StartTime of RPC:Starting and the EndTime of RPC:Completed?

If it helps, you can examine the stored procedure with the Display Estimated Execution Plan tool (CTRL + L) in SQL Server Management Studio.
It shows the estimated cost of each operation in the plan.
I hope this helps.
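
On the StartTime/EndTime versus Duration question, a minimal sketch may help. It assumes the trace has been exported to rows with the hypothetical keys `EventClass`, `ObjectName` and `Duration`, and that `Duration` is stored in microseconds (check this for your SQL Server version and export method). Because each RPC:Completed row already carries the duration of its call, the sketch sums that column per procedure instead of pairing Starting and Completed rows.

```python
from collections import defaultdict

def slowest_procedures(rows, top=3):
    """Sum the Duration of RPC:Completed rows per procedure, slowest first."""
    totals = defaultdict(lambda: {"calls": 0, "total_us": 0})
    for row in rows:
        if row["EventClass"] != "RPC:Completed":
            continue  # RPC:Starting rows carry no duration
        stats = totals[row["ObjectName"]]
        stats["calls"] += 1
        stats["total_us"] += row["Duration"]
    ranked = sorted(totals.items(), key=lambda kv: kv[1]["total_us"], reverse=True)
    # Return (procedure name, number of calls, average duration in microseconds).
    return [(name, s["calls"], s["total_us"] / s["calls"]) for name, s in ranked[:top]]

# Hypothetical sample rows, not taken from a real trace.
sample = [
    {"EventClass": "RPC:Starting", "ObjectName": "usp_GetOrders", "Duration": None},
    {"EventClass": "RPC:Completed", "ObjectName": "usp_GetOrders", "Duration": 1_500_000},
    {"EventClass": "RPC:Completed", "ObjectName": "usp_GetCustomer", "Duration": 20_000},
    {"EventClass": "RPC:Completed", "ObjectName": "usp_GetOrders", "Duration": 500_000},
]

for name, calls, avg_us in slowest_procedures(sample):
    print(f"{name}: {calls} calls, avg {avg_us / 1000:.1f} ms")
```

Ranking by total duration highlights procedures that are cheap per call but called very often; sorting by the average instead would surface the individually slow ones.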

Talend's tOracleInput does not read data

My colleague created a project in Talend to read data from an Oracle database.
I am using his project, so I have his Job context with the connection parameters for the Oracle DB, and Talend connects successfully on that computer.
I created a trivial job composed of two components: tOracleInput, which should read the data, and tLogRow, which should write the output to Talend's terminal.
The problem is that when I start the job, no data is written to the terminal, and instead of the number of rows per second I see a "Starting ..." status.
Could this be a connection issue, an unsuitable Java version on my computer, or something else?

The "Starting..." status means that the query is being executed. A simple query against the database usually takes only a few seconds to start returning rows, because Oracle can begin returning data before it has completed a full table scan. Queries with joins and filters can benefit from this behavior, but queries with GROUP BY / ORDER BY cannot.
On the other hand, if you are using a view, executing a complex query, or simply using DISTINCT, the query can take a few minutes to execute, because the Oracle database builds the result set on the database side before returning any records.

Retrieving billions of rows from remote server?

I am trying to retrieve around 200 billion rows from a remote SQL Server. To optimize this, I have limited my query to use only an indexed column as a filter and am selecting only a subset of columns to make the query look like this:
SELECT ColA, ColB, ColC FROM <Database> WHERE RecordDate BETWEEN '' AND ''
But it looks like unless I limit my query to a time window of a few hours, the query fails in all cases with the following error:
OLE DB provider "SQLNCLI10" for linked server "<>" returned message "Query timeout expired".
Msg 7399, Level 16, State 1, Server M<, Line 1
The OLE DB provider "SQLNCLI10" for linked server "<>" reported an error. Execution terminated by the provider because a resource limit was reached.
Msg 7421, Level 16, State 2, Server <>, Line 1
Cannot fetch the rowset from OLE DB provider "SQLNCLI10" for linked server "<>".
The timeout is probably an issue because of the time it takes to execute the query plan. As I do not have control over the server, I was wondering if there is a good way of retrieving this data beyond the simple SELECT I am using. Are there any SQL Server specific tricks that I can use? Perhaps tell the remote server to paginate the data instead of issuing multiple queries or something else? Any suggestions on how I could improve this?

This is more of the kind of job SSIS is suited for. Even a simple flow like ReadFromOleDbSource->WriteToOleDbSource would handle this, creating the necessary batching for you.

Why read 200 billion rows all at once?
You should page them, reading, say, a few thousand rows at a time.
Even if you genuinely need to read all 200 billion rows, you should still consider using paging to break the read up into shorter queries. That way, if a failure happens, you just continue reading where you left off.
See "efficient way to implement paging" for at least one method of implementing paging using ROW_NUMBER.
If you are doing data analysis, then I suspect you are either using the wrong storage (SQL Server isn't really designed for processing large data sets) or you need to alter your queries so that the analysis is done on the server using SQL.
Update: I think the last paragraph was somewhat misinterpreted.
Storage in SQL Server is primarily designed for online transaction processing (OLTP) - efficient querying of massive datasets in massively concurrent environments (for example reading / updating a single customer record in a database of billions, at the same time that thousands of other users are doing the same for other records). Typically the goal is to minimise the amount of data read, reducing the amount of IO needed and also reducing contention.
The analysis you are talking about is almost the exact opposite of this - a single client actively trying to read pretty much all records in order to perform some statistical analysis.
Yes SQL Server will manage this, but you have to bear in mind that it is optimised for a completely different scenario. For example data is read from disk a page (8 KB) at a time, despite the fact that your statistical processing is probably only based on 2 or 3 columns. Depending on row density and column width you may only be using a tiny fraction of the data stored on an 8 KB page - most of the data that SQL Server had to read and allocate memory for wasn't even used. (Remember that SQL Server also had to lock that page to prevent other users from messing with the data while it was being read).
If you are serious about processing / analysis of massive datasets, then there are storage formats that are optimised for exactly this sort of thing. SQL Server also has an add-on service called Microsoft Analysis Services that adds online analytical processing (OLAP) and data mining capabilities, using storage modes more suited to this sort of processing.
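
To put a rough number on the "tiny fraction" point above, here is a back-of-the-envelope sketch. The row width and column sizes are hypothetical, and the calculation ignores page headers and per-row overhead, so treat the result as an order-of-magnitude illustration only.

```python
PAGE_BYTES = 8 * 1024  # SQL Server reads data one 8 KB page at a time

def useful_fraction(row_bytes, needed_bytes):
    """Fraction of each page's bytes that the analysis actually uses."""
    rows_per_page = PAGE_BYTES // row_bytes
    return rows_per_page * needed_bytes / PAGE_BYTES

# Hypothetical table: 400-byte rows, analysis needs three 8-byte columns.
fraction = useful_fraction(row_bytes=400, needed_bytes=3 * 8)
print(f"{fraction:.1%} of each page read is actually used")
```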

Personally, if I were trying to pull that much data at once, I would use a data extraction tool such as BCP to get the data into a local file before trying to manipulate it.
http://msdn.microsoft.com/en-us/library/ms162802.aspx

This isn't a SQL Server specific answer, but even when the RDBMS supports server-side cursors, it's considered poor form to use them. Doing so means that you are consuming resources on the server while it is merely waiting for you to request more data.
Instead, you should reformulate your query usage so that the server can transmit the entire result set as soon as it can and then completely forget about you and your query to make way for the next one. When the result set is too large for you to process in one go, you should keep track of the last row returned by the current batch so that you can fetch another batch starting at that position.
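
The "keep track of the last row" approach can be sketched as follows. This is an illustration only: it uses an in-memory SQLite database from Python's standard library as a stand-in for the remote server, and the table `records` and its columns are hypothetical. Against SQL Server the same idea would use `TOP (n)` together with `WHERE key > last_key ORDER BY key` on an indexed key.

```python
import sqlite3

def fetch_in_batches(conn, batch_size, start_after=0):
    """Yield batches ordered by id, each starting after the last id seen."""
    last_id = start_after
    while True:
        rows = conn.execute(
            "SELECT id, col_a FROM records WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # remember where this batch ended

# Stand-in data: ten rows in a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, col_a TEXT)")
conn.executemany(
    "INSERT INTO records (id, col_a) VALUES (?, ?)",
    [(i, f"value-{i}") for i in range(1, 11)],
)

batches = list(fetch_in_batches(conn, batch_size=4))
print([len(batch) for batch in batches])
```

Each query is short, and the position lives on the client, so after a failure the read can be restarted by passing the last id that was processed as `start_after`.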

Odds are the remote server has the "Remote Query Timeout" set. How long does it take for the query to fail?

I just ran into the same problem; I also got the message 10:01 after starting the query.
Check this link. There's a remote query timeout setting under Connections that is set to 600 seconds by default, and you need to change it to zero (unlimited) or another value that you think is right.

Try changing the remote server's connection timeout property.
To do that, open SSMS, connect to the server, right-click the server's name in Object Explorer, select Properties -> Connections, and change the value in the "Remote query timeout (in seconds, 0 = no timeout)" text box.

DBMS Query Log, is it possible?

We want to record all DB queries in a log table. Is that possible?

For SQL Server, see this answer about SQL Server Profiler. For MySQL, the query log is a solution. Both write to files, but you can always parse the log files and insert the entries into tables if you want to query the data.
Beware, however, that logging does not come free. You will see some performance degradation in both cases. If you only want to log the queries of one application, you could opt to log the queries there (optionally, asynchronously). You'll have to test to see which option works best.
EDIT: Also, depending on your amount of traffic, logging all queries can eat up large amounts of disk space in a short time. If you log in the application, you could use a logging library like NLog that has a rollover system (i.e. once the log files exceed 100 MB, it starts deleting the oldest files). In all three cases, you could also set aside a partition meant only for logging so that it doesn't fill up your main hard disks.
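
As a sketch of application-side query logging with rollover: the example below uses Python's standard `logging.handlers.RotatingFileHandler` rather than NLog, purely to illustrate the idea. The helper `log_query`, the logger name, the file name and the size limits are all hypothetical, and the limits are far smaller than you would use in practice.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "queries.log")

# Roll over at roughly 1 KB and keep at most 3 old files; older ones are deleted.
handler = RotatingFileHandler(log_path, maxBytes=1024, backupCount=3)
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))

query_logger = logging.getLogger("app.sql")
query_logger.setLevel(logging.INFO)
query_logger.addHandler(handler)
query_logger.propagate = False  # do not echo queries to the root logger

def log_query(sql, params=()):
    """Record a statement and its parameters before it is executed."""
    query_logger.info("%s | params=%r", sql, params)

# Simulate an application issuing many queries.
for customer_id in range(200):
    log_query("SELECT * FROM customers WHERE id = ?", (customer_id,))

handler.close()
print(sorted(os.listdir(log_dir)))
```

Capping both the file size and the number of backups bounds the disk space the log can consume, which is the point of the rollover system mentioned above.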

From a SQL Server perspective...
As others have suggested, SQL Server Profiler is certainly one way to go, but you will incur a resource hit from doing so. Should you choose this method, you absolutely must implement it as a server-side trace rather than via the GUI.
You may also have some success monitoring and recording the contents of the Dynamic Management Views (DMVs) for things such as query execution statistics.
You'll want to look at DMVs such as:
sys.dm_exec_query_stats
sys.dm_exec_sql_text
sys.dm_exec_query_plan
For example, here is a query that identifies the 20 worst-performing SQL queries by CPU consumption. It is not exactly what you are after, but it does demonstrate how to use the DMVs that you would be interested in.
SELECT TOP 20
    qs.sql_handle,
    qs.execution_count,
    qs.total_worker_time AS Total_CPU,
    total_CPU_inSeconds = -- converted from microseconds
        qs.total_worker_time / 1000000,
    average_CPU_inSeconds = -- decimal division, so sub-second averages are not truncated to 0
        qs.total_worker_time / 1000000.0 / qs.execution_count,
    qs.total_elapsed_time,
    total_elapsed_time_inSeconds = -- converted from microseconds
        qs.total_elapsed_time / 1000000,
    st.text,
    qp.query_plan
FROM
    sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;

I don't think you can write directly to a DB table. MySQL can write the queries to a file, and you can write a script to parse the file and insert the queries.

What is your FIRST SQL command to run to troubleshoot SQL Server performance?

When SQL Server (2000/2005/2008) is running sluggishly, what is the first command that you run to see where the problem is?
The purpose of this question is that, once all the answers are compiled, other users can benefit by running your command of choice to isolate where the problem might be.
There are other troubleshooting posts regarding SQL Server performance, but they are useful only for specific cases.
If you roll your own custom SQL script, please let others know:
- what the purpose of the script is
- what it returns (return value)
- what to do to figure out where the problem is
If you can provide the source for the script, please post it.
In my case, I run
sp_lock
Its purpose is to find out whether there are any locks; it returns SQL Server lock information. Since the result set displays object IDs (and is thus not very human-readable), I usually skim through the result to see whether there are abnormally many locks.
Feel free to update tags

Why run a single query when a picture is worth a thousand words!
I prefer to run the freely available Performance Dashboard Reports.
They provide a complete snapshot overview of your server's performance in seconds. You can then choose a specific area to investigate (locking, currently running queries, wait requests, etc.) simply by clicking the appropriate area on the Dashboard.
http://www.microsoft.com/downloads/details.aspx?FamilyId=1d3a4a0d-7e0c-4730-8204-e419218c1efc&displaylang=en
One slight caveat: I believe these are only available in SQL Server 2005 and above.

sp_who
http://msdn.microsoft.com/en-us/library/aa260384(SQL.80).aspx
I want to see "who", what machines/users are running what queries, length of time, etc. I can also easily scan for blocks.
If something is blocking a bunch of other transactions I can use the spid to issue a kill command if necessary.

sp_who_3 - Provides a lot of information available elsewhere but in one nice output. Also has several parameters to allow customized output.

A custom query that combines what you would expect from sp_who with DBCC INPUTBUFFER(spid) to get the last query text on each spid, ordered by the blocked/blocking graph.
Process data is available via master..sysprocesses.

sp_who3 returns standard sp_who2 output until you specify a particular spid; it then gives 6 different recordsets about that spid, including locks, blocks, what it's currently doing, the T-SQL it's running, and the statement within the T-SQL that is currently running.

Ian Stirk has a great script I like to use as detailed in this article: http://msdn2.microsoft.com/en-ca/magazine/cc135978.aspx
In particular, I like the missing indexes one:
SELECT
DatabaseName = DB_NAME(database_id)
,[Number Indexes Missing] = count(*)
FROM sys.dm_db_missing_index_details
GROUP BY DB_NAME(database_id)
ORDER BY 2 DESC;

DBCC OPENTRAN to see what the oldest active transaction is:
"Displays information about the oldest active transaction and the oldest distributed and nondistributed replicated transactions, if any, within the specified database. Results are displayed only if there is an active transaction or if the database contains replication information. An informational message is displayed if there are no active transactions."
Followed by sp_who2.

I use queries like the ones in "Number of open/active connections in ms sql server 2005".
