We want to record all DB queries into a log table. Is that possible?
For SQL Server, see this answer about SQL Server Profiler. For MySQL, the query log is a solution. They both write to files, but you can always parse the log files and insert them into tables if you want to query the data.
Beware, however, that logging does not come for free. You will see some performance degradation in both cases. If you only want to log the queries of one application, you could opt to log the queries there instead (optionally, asynchronously). You'll have to test to see which is the best option.
EDIT: Also, depending on your amount of traffic, logging all queries can eat a large amount of disk space in a short amount of time. If you log in the application, you could use a logging library like NLog that has a rollover system (i.e. if the log files exceed 100 MB, start deleting the oldest files). In all three cases, you could also set aside a partition meant only for logging so it doesn't fill up your main hard disks.
From a SQL Server perspective...
As others have suggested, SQL Server Profiler is certainly one way to go, but you're going to incur a resource hit from doing so. Should you choose this method, you absolutely must implement it as a server-side trace rather than via the GUI.
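For reference, a minimal server-side trace might look something like the sketch below (the file path, rollover size, and the choice of events/columns are illustrative, not prescriptive). The resulting .trc files can later be loaded into a table with fn_trace_gettable, which gets you back to the original goal of querying the log.

DECLARE @TraceID int, @maxfilesize bigint, @on bit
SET @maxfilesize = 100   -- MB per rollover file (illustrative)
SET @on = 1

-- 2 = TRACE_FILE_ROLLOVER; the path is a hypothetical location on the server
EXEC sp_trace_create @TraceID OUTPUT, 2, N'C:\Traces\query_log', @maxfilesize, NULL

-- Event 12 = SQL:BatchCompleted, Event 10 = RPC:Completed
-- Columns: 1 = TextData, 11 = LoginName, 12 = SPID, 13 = Duration, 14 = StartTime
EXEC sp_trace_setevent @TraceID, 12, 1,  @on
EXEC sp_trace_setevent @TraceID, 12, 11, @on
EXEC sp_trace_setevent @TraceID, 12, 12, @on
EXEC sp_trace_setevent @TraceID, 12, 13, @on
EXEC sp_trace_setevent @TraceID, 12, 14, @on
EXEC sp_trace_setevent @TraceID, 10, 1,  @on
EXEC sp_trace_setevent @TraceID, 10, 11, @on
EXEC sp_trace_setevent @TraceID, 10, 12, @on
EXEC sp_trace_setevent @TraceID, 10, 13, @on
EXEC sp_trace_setevent @TraceID, 10, 14, @on

EXEC sp_trace_setstatus @TraceID, 1   -- start the trace

-- Later, to pull the captured queries into a table:
-- SELECT * INTO dbo.QueryLog FROM fn_trace_gettable(N'C:\Traces\query_log.trc', DEFAULT)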
You may also have some success monitoring and recording the contents of the Dynamic Management Views (DMVs) for things such as query execution statistics.
You'll want to look at DMVs such as:
sys.dm_exec_query_stats
sys.dm_exec_sql_text
sys.dm_exec_query_plan
For example, here is a query that can be used to identify the top 20 poorest-performing SQL queries by CPU consumption. It's not exactly what you are after, but it does demonstrate how to use the DMVs that you would be interested in.
SELECT TOP 20
qs.sql_handle,
qs.execution_count,
qs.total_worker_time AS Total_CPU,
total_CPU_inSeconds = --Converted from microseconds
qs.total_worker_time/1000000,
average_CPU_inSeconds = --Converted from microseconds
(qs.total_worker_time/1000000) / qs.execution_count,
qs.total_elapsed_time,
total_elapsed_time_inSeconds = --Converted from microseconds
qs.total_elapsed_time/1000000,
st.text,
qp.query_plan
FROM
sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS apply sys.dm_exec_query_plan (qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC
I don't think you can write them to a DB table directly. MySQL can write them to a file, and you can write a script to parse the file and insert the queries.
I'm looking for a query to get the currently running queries in Azure SQL. All of the T-SQL I've found does not show the running queries when I test it (for instance, run a query in one window, then look in another window at the running queries). Also, I'm not looking for anything related to time, CPU, etc., only the actual running query text.
When I run ...
SELECT * FROM Table --(takes 2 minutes to load)
... and run a standard information query (like from Pinal Dave or this), I don't see the above query (I assume there's another way).
select * from sys.dm_exec_requests should give you what other sessions are doing. You can join this with sys.dm_exec_sql_text to get the text if needed. sys.dm_tran_locks gives the locks held / waited on. If this is a V12 server you can also use DBCC INPUTBUFFER. Make sure that the connection you are running under is dbo / server admin.
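As a rough sketch of that join (excluding your own session; column selection is illustrative):

SELECT
    r.session_id,
    r.status,
    r.start_time,
    r.blocking_session_id,
    r.wait_type,
    t.text AS running_batch
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID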
Continuing from this post - I have some more questions I hope someone can help me with:
Is there a way to select a specific database and/or table to get the queries from or add this as a column?
In my queries there are some variables shown as #P1 or #GUID. Is there a way to get the data that was inserted there?
I am only using Express, so I also have no access to SQL Profiler.
sys.dm_exec_sql_text has a dbid column, so you can filter on that. For example I took the query from the other answer and added a where clause filtering on queries against master:
SELECT deqs.last_execution_time AS [Time], dest.TEXT AS [Query]
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE dest.dbid = DB_ID('master')
ORDER BY deqs.last_execution_time DESC
Note that not all queries have the right database context (or a database context at all). For example, if you have a query that joins two tables in different databases, you will only see one dbid: it will be the executing context, which may or may not be one of the databases referenced in the query. So applying the filter might actually hide queries you are interested in.
You may be able to obtain parameters by digging into the XML from other DMOs such as sys.dm_exec_cached_plans and sys.dm_exec_query_plan. If you have an execution plan already for a query that you've captured, it is going to be much easier to use a tool like SQL Sentry Plan Explorer than to wade through gobs of XML yourself.
Disclaimer: I work for SQL Sentry, who provides the free tool to the community.
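If you do want to dig into the plan XML yourself, a rough sketch is below. Note the caveat: this only surfaces the compile-time parameter values embedded in the cached plan, not the values supplied on each individual execution.

;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
    st.text                                                  AS query_text,
    pr.pv.value('@Column', 'nvarchar(128)')                  AS parameter_name,
    pr.pv.value('@ParameterCompiledValue', 'nvarchar(max)')  AS compiled_value
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY qp.query_plan.nodes('//ParameterList/ColumnReference') AS pr(pv)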
Just an FYI: even though SQL Express doesn't include Profiler, if you have access to it you can use it against an Express instance.
[Cross posted from the Database Administrators site, in the hope that it may gain better traction here. I'll update either site as appropriate.]
I have a stored procedure in SQL Server 2005 (SP2) which contains a single query like the following (simplified for clarity)
SELECT * FROM OPENQUERY(MYODBC, 'SELECT * FROM MyTable WHERE Option = ''Y'' ')
OPTION (MAXDOP 1)
When this proc is run (and run only once) I can see the plan appear in sys.dm_exec_query_stats with a high 'total_worker_time' value (e.g. 34762.196 ms). This is close to the elapsed time. However, in SQL Management Studio the statistics show a much lower CPU time, as I'd expect (e.g. 6828 ms). The query takes some time to return, because of the slowness of the server it is talking to, but it doesn't return many rows.
I'm aware of the issue that parallel queries in SQL Server 2005 can present odd CPU times, which is why I've tried to turn off any parallelism with the query hint (though I really don't think there was any in any case).
I don't know how to account for the fact that the two ways of looking at CPU usage can differ, nor which of the two might be accurate (I have other reasons for thinking that the CPU usage may be the higher number, but it's tricky to measure). Can anyone give me a steer?
UPDATE: I was assuming that the problem was with the OPENQUERY so I tried looking at times for a long-running query which doesn't use OPENQUERY. In this case the statistics (gained by setting STATISTICS TIME ON) reported the CPU time at 3315ms, whereas the DMV gave it at 0.511ms. The total elapsed times in each case agreed.
total_worker_time in sys.dm_exec_query_stats is cumulative - it is the total CPU time for all executions of the currently compiled version of the query - see execution_count for the number of executions this represents.
See last_worker_time, min_worker_time or max_worker_time for timing of individual executions.
reference: http://msdn.microsoft.com/en-us/library/ms189741.aspx
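So, as a quick sketch, you can derive per-execution figures from the cumulative counters yourself (the LIKE filter is just a hypothetical way of finding the statement in question):

SELECT
    st.text,
    qs.execution_count,
    qs.total_worker_time / qs.execution_count AS avg_worker_time_us,  -- microseconds
    qs.last_worker_time,
    qs.min_worker_time,
    qs.max_worker_time
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%OPENQUERY%'   -- hypothetical filter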
I am trying to retrieve around 200 billion rows from a remote SQL Server. To optimize this, I have limited my query to use only an indexed column as a filter and am selecting only a subset of columns to make the query look like this:
SELECT ColA, ColB, ColC FROM <Database> WHERE RecordDate BETWEEN '' AND ''
But it looks like unless I limit my query to a time window of a few hours, the query fails in all cases with the following error:
OLE DB provider "SQLNCLI10" for linked server "<>" returned message "Query timeout expired".
Msg 7399, Level 16, State 1, Server M<, Line 1
The OLE DB provider "SQLNCLI10" for linked server "<>" reported an error. Execution terminated by the provider because a resource limit was reached.
Msg 7421, Level 16, State 2, Server <>, Line 1
Cannot fetch the rowset from OLE DB provider "SQLNCLI10" for linked server "<>".
The timeout is probably an issue because of the time it takes to execute the query plan. As I do not have control over the server, I was wondering if there is a good way of retrieving this data beyond the simple SELECT I am using. Are there any SQL Server specific tricks that I can use? Perhaps tell the remote server to paginate the data instead of issuing multiple queries or something else? Any suggestions on how I could improve this?
This is more of the kind of job SSIS is suited for. Even a simple flow like ReadFromOleDbSource->WriteToOleDbSource would handle this, creating the necessary batching for you.
Why read 200 billion rows all at once?
You should page them, reading, say, a few thousand rows at a time.
Even if you do genuinely need to read all 200 billion rows, you should still consider using paging to break the read into shorter queries - that way, if a failure happens, you just continue reading where you left off.
See efficient way to implement paging for at least one method of implementing paging using ROW_NUMBER
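As a rough sketch of what ROW_NUMBER paging could look like against the linked server (the server, database, table, and date values here are illustrative placeholders):

DECLARE @PageSize int, @PageNumber int
SET @PageSize = 100000
SET @PageNumber = 1

;WITH Numbered AS
(
    SELECT ColA, ColB, ColC,
           ROW_NUMBER() OVER (ORDER BY RecordDate, ColA) AS rn
    FROM LinkedServer.RemoteDb.dbo.MyTable
    WHERE RecordDate BETWEEN '20120101' AND '20120102'
)
SELECT ColA, ColB, ColC
FROM Numbered
WHERE rn BETWEEN (@PageNumber - 1) * @PageSize + 1 AND @PageNumber * @PageSize
ORDER BY rn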
If you are doing data analysis then I suspect you are either using the wrong storage (SQL Server isn't really designed for processing of large data sets), or you need to alter your queries so that the analysis is done on the Server using SQL.
Update: I think the last paragraph was somewhat misinterpreted.
Storage in SQL Server is primarily designed for online transaction processing (OLTP) - efficient querying of massive datasets in massively concurrent environments (for example, reading / updating a single customer record in a database of billions, at the same time that thousands of other users are doing the same for other records). Typically the goal is to minimise the amount of data read, reducing the amount of IO needed and also reducing contention.
The analysis you are talking about is almost the exact opposite of this - a single client actively trying to read pretty much all records in order to perform some statistical analysis.
Yes SQL Server will manage this, but you have to bear in mind that it is optimised for a completely different scenario. For example data is read from disk a page (8 KB) at a time, despite the fact that your statistical processing is probably only based on 2 or 3 columns. Depending on row density and column width you may only be using a tiny fraction of the data stored on an 8 KB page - most of the data that SQL Server had to read and allocate memory for wasn't even used. (Remember that SQL Server also had to lock that page to prevent other users from messing with the data while it was being read).
If you are serious about processing / analysis of massive datasets then there are storage formats that are optimised for exactly this sort of thing - SQL Server also has an add-on service called Microsoft Analysis Services that adds additional online analytical processing (OLAP) and data mining capabilities, using storage modes more suited to this sort of processing.
Personally I would use a data extraction tool such as BCP to get the data to a local file before trying to manipulate it if I was trying to pull that much data at once.
http://msdn.microsoft.com/en-us/library/ms162802.aspx
This isn't a SQL Server-specific answer, but even when the RDBMS supports server-side cursors, it's considered poor form to use them. Doing so means that you are consuming resources on the server even while it is simply waiting for you to request more data.
Instead, you should reformulate your query usage so that the server can transmit the entire result set as soon as it can, and then completely forget about you and your query to make way for the next one. When the result set is too large for you to process all in one go, you should keep track of the last row returned by the current batch so that you can fetch another batch starting at that position.
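A minimal sketch of that pattern, assuming RecordDate (plus a tiebreaker column if it isn't unique) identifies the position of the last row you've seen; the object names and date are illustrative:

DECLARE @LastDate datetime
SET @LastDate = '20120101'   -- the highest RecordDate returned by the previous batch

SELECT TOP (100000) ColA, ColB, ColC, RecordDate
FROM LinkedServer.RemoteDb.dbo.MyTable
WHERE RecordDate > @LastDate
ORDER BY RecordDate
-- remember MAX(RecordDate) from this batch and feed it back in as @LastDate next time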
Odds are the remote server has the "Remote Query Timeout" set. How long does it take for the query to fail?
Just ran into the same problem; I also got the message at 10:01 after running the query.
Check this link. There's a remote query timeout setting under Connections that's set to 600 seconds by default, and you need to change it to zero (unlimited) or another value you think is right.
Try changing the remote server connection timeout property.
To do that in SSMS, connect to the server, right-click the server's name in Object Explorer, then select Properties -> Connections and change the value in the 'Remote query timeout (in seconds, 0 = no timeout)' text box.
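The same setting can also be changed with T-SQL, which should be equivalent (it is a server-wide setting; 0 means no timeout):

EXEC sp_configure 'remote query timeout', 0
RECONFIGURE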
When SQL Server (2000/2005/2008) is running sluggishly, what is the first command that you run to see where the problem is?
The purpose of this question is that, when all answers are compiled, other users can benefit by running your command of choice to narrow down where the problem might be.
There are other troubleshooting posts regarding SQL Server performance but they can be useful only for specific cases.
If you roll out and run your own custom SQL script, then would you let others know:
what the purpose of the script is
what it returns (return value)
what to do to figure out where the problem is

If you could provide the source for the script, please post it.
In my case,
sp_lock
I run it to figure out if there are any locks (purpose); it returns SQL Server lock information. Since the result set displays object IDs (thus not very human readable), I would usually skim through the results to see if there are abnormally many locks.
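If you want the lock information with readable names instead of object IDs, a rough sketch using the DMVs might look like this (the two-argument OBJECT_NAME needs SQL 2005 SP2 or later; the filter on OBJECT locks is illustrative):

SELECT
    tl.request_session_id,
    DB_NAME(tl.resource_database_id) AS database_name,
    OBJECT_NAME(tl.resource_associated_entity_id, tl.resource_database_id) AS object_name,  -- meaningful for OBJECT locks
    tl.resource_type,
    tl.request_mode,
    tl.request_status
FROM sys.dm_tran_locks AS tl
WHERE tl.resource_type = 'OBJECT'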
Feel free to update tags
Why run a single query when a picture is worth a thousand words!
I prefer to run the freely available Performance Dashboard Reports.
They provide a complete snapshot overview of your server's performance in seconds. You can then choose a specific area to investigate (locking, currently running queries, wait requests, etc.) simply by clicking the appropriate area on the Dashboard.
http://www.microsoft.com/downloads/details.aspx?FamilyId=1d3a4a0d-7e0c-4730-8204-e419218c1efc&displaylang=en
One slight caveat: I believe these are only available in SQL 2005 and above.
sp_who
http://msdn.microsoft.com/en-us/library/aa260384(SQL.80).aspx
I want to see "who", what machines/users are running what queries, length of time, etc. I can also easily scan for blocks.
If something is blocking a bunch of other transactions I can use the spid to issue a kill command if necessary.
sp_who_3 - Provides a lot of information available elsewhere but in one nice output. Also has several parameters to allow customized output.
A custom query which combines what you would expect in sp_who with DBCC INPUTBUFFER(spid) to get the last query text on each spid ordered by the blocked/blocking graph.
Process data is available via master..sysprocesses.
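A rough sketch of that combined approach (illustrative only): walk the blocked/blocking spids in master..sysprocesses and capture each one's input buffer with DBCC INPUTBUFFER.

CREATE TABLE #inputbuffer (spid int NULL, EventType nvarchar(30), Parameters int, EventInfo nvarchar(4000))
DECLARE @spid int, @sql nvarchar(200)

DECLARE spids CURSOR LOCAL FAST_FORWARD FOR
    SELECT spid FROM master..sysprocesses WHERE blocked <> 0
    UNION
    SELECT blocked FROM master..sysprocesses WHERE blocked <> 0

OPEN spids
FETCH NEXT FROM spids INTO @spid
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'DBCC INPUTBUFFER(' + CAST(@spid AS nvarchar(10)) + N') WITH NO_INFOMSGS'
    INSERT #inputbuffer (EventType, Parameters, EventInfo) EXEC (@sql)
    UPDATE #inputbuffer SET spid = @spid WHERE spid IS NULL   -- tag the rows just captured
    FETCH NEXT FROM spids INTO @spid
END
CLOSE spids
DEALLOCATE spids

SELECT * FROM #inputbuffer ORDER BY spid
DROP TABLE #inputbuffer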
sp_who3 returns standard sp_who2 output until you specify a specific spid, then gives 6 different recordsets about that spid including locks, blocks, what it's currently doing, the T-SQL it's running, and the statement within the T-SQL that is currently running.
Ian Stirk has a great script I like to use as detailed in this article: http://msdn2.microsoft.com/en-ca/magazine/cc135978.aspx
In particular, I like the missing indexes one:
SELECT
DatabaseName = DB_NAME(database_id)
,[Number Indexes Missing] = count(*)
FROM sys.dm_db_missing_index_details
GROUP BY DB_NAME(database_id)
ORDER BY 2 DESC;
DBCC OPENTRAN to see what the oldest active transaction is
Displays information about the oldest active transaction and the oldest distributed and nondistributed replicated transactions, if any, within the specified database. Results are displayed only if there is an active transaction or if the database contains replication information. An informational message is displayed if there are no active transactions.
followed by sp_who2
I use queries like these:
Number of open/active connections in ms sql server 2005