SQL Server execution plans show an "Estimated CPU Cost" in the operator properties and tooltips, as in the following example.
Does this "Estimated CPU Cost" represent the estimated "CPU usage %" (as shown in tools like Task Manager or PerfMon)?
Does the "Estimated CPU Cost" shown in the execution plan represent
the estimated CPU usage %
No, this is a unitless number generated by the cost model. When added to the IO cost and other fixed per-operator costs, it gives an overall plan cost that originally (last millennium) was somewhat correlated with execution time in seconds on a certain Microsoft employee's machine.
These days it cannot be correlated with any specific estimate of CPU utilisation as a percentage or with elapsed time, and it is only intended to be used by the optimiser itself when costing plans and comparing plan costs.
You can see that this has no real-world correlation, as the CPU costs for the same operator will be largely the same across computers running the same version of SQL Server, irrespective of the model of CPU they contain (and many of the formulas have remained much the same across multiple product versions since SQL Server 2000).
E.g. Joe Chang calculated that a Clustered Index Scan, Table Scan, or Index Seek will be given a CPU cost of 0.0001581 + 0.0000011 per row.
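To make the shape of that formula concrete, here is a minimal sketch applying Joe Chang's figures above; the one-million row count is hypothetical, chosen purely for illustration:

-- Applying the per-operator CPU cost formula quoted above
-- (fixed 0.0001581 plus 0.0000011 per row; the row count is hypothetical).
DECLARE @Rows bigint = 1000000;
SELECT 0.0001581 + 0.0000011 * @Rows AS EstimatedCpuCost;
-- Returns 1.1001581: a unitless cost, not seconds and not a CPU %.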
You might also be interested in reading Inside the Optimizer: Plan Costing
SQL Server query with 6M records taking 8 sec - is it normal?
If not, then how can I optimize the query to reduce the execution time?
select ChargeID, SUM(Fee) from Charges group by ChargeID
The server machine has a Xeon(R) CPU with 12GB of RAM and runs a 64-bit OS.
Memory usage is nearly 10GB and CPU usage is 5-10%.
The Charges table has only a clustered index on ChargeID.
Here is the execution plan:
Can you recommend some tips or tricks to reduce the execution time? Thanks.
Yes and no. It depends on the server, and likely more on the disc IO.
You do an index seek - that is as good as it gets. The question is how fast the discs deliver the data. I would expect a lot less time, but then I would expect the "discs" to be SSDs in 2014 for any real analysis.
I would check disc IO, latencies etc. - but from the SQL side there is nothing more you can do; that is as good as it gets as a query plan.
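If you want to check those latencies from inside SQL Server rather than from the OS, here is a minimal sketch using the sys.dm_io_virtual_file_stats DMV (the figures are cumulative since the last restart; what counts as "slow" is a judgment call, not part of the original answer):

-- Rough average read/write latency per database file.
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       1.0 * vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
       1.0 * vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY avg_read_latency_ms DESC;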
The query itself remains constant, i.e. it will always be the same query.
E.g. a select query takes 30 minutes if it returns 10000 rows.
Would the same query take 1 hour if it had to return 20000 rows?
I am interested in knowing the mathematical relation between the number of rows (N) and the execution time (T), keeping the other parameters constant (K).
i.e. is it T = N*K, or T = N*K + C, or some other formula?
I am reading http://research.microsoft.com/pubs/76556/progress.pdf in case it helps. If anybody understands it before me, please do reply. Thanks...
Well, that is a good question :), but there is no exact formula, because it depends on the execution plan.
The SQL query optimizer could choose a different execution plan for a query that returns a different number of rows.
I guess that if the execution plan is the same for both queries and you have "lab" conditions, then the time growth could be linear. You should research SQL execution plans and statistics further.
Take the very simple example of reading every row in a single table.
In the worst case, you will have to read every page of the table from your underlying storage, doing a random seek for each one. The seek time will dominate all other factors, so you can estimate the total time:
time ~= seek time x number of data pages
Assuming your rows are of a fairly regular size, then this is linear in the number of rows.
However, databases do a number of things to try to avoid this worst case. For example, in SQL Server table storage is often allocated in extents of 8 consecutive pages, and a hard drive has a much faster streaming IO rate than random IO rate. If you have a clustered index, reading the pages in cluster order tends to involve a lot more streaming IO than random IO.
The best-case time, ignoring memory caching, is (8KB is the SQL Server page size):
time ~= 8KB * number of data pages / streaming IO rate in KB/s
This is also linear in the number of rows.
As long as you do a reasonable job of managing fragmentation, you can reasonably extrapolate linearly in this simple case. This assumes your data is much larger than the buffer cache; if not, you also have to worry about the cliff edge where your query switches from reading from the buffer to reading from disk.
I'm also ignoring details like parallel storage paths and access.
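To put hypothetical numbers on the two bounds above (the page count, seek time, and streaming rate are illustrative hardware figures, not from the original answer):

-- Worst-case vs best-case scan time for a hypothetical table.
DECLARE @Pages float = 100000;      -- data pages in the table (8KB each)
DECLARE @SeekMs float = 5;          -- average random seek time in ms
DECLARE @StreamKBs float = 200000;  -- streaming IO rate in KB/s (200 MB/s)

SELECT (@Pages * @SeekMs) / 1000.0 AS worst_case_seconds, -- 500 s of random seeks
       (@Pages * 8.0) / @StreamKBs AS best_case_seconds;  -- 4 s of streaming IO

Both figures scale linearly with the number of pages, and therefore (for fairly regular row sizes) with the number of rows.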
I have seen references to the cost of a SQL statement everywhere in database materials.
What exactly does it mean? Is it the number of statements to be executed, or something else?
Cost is a rough measure of how much CPU time and disk I/O the server must spend to execute the query.
Typically the cost is split into a few components:
Cost to parse the query - it can be non-trivial just to understand what you are asking for.
Cost to read data from disk, and to access indexes if that reduces the overall cost.
Cost to compute some mathematical operations on your data.
Cost to group or sort data.
Cost to deliver data back to client.
The cost is an arbitrary number that is higher when the CPU time, IO, or memory usage is higher. It is also vendor specific.
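In SQL Server, for instance, you can inspect these per-operator estimates without executing anything. A minimal sketch, reusing the Charges query from the earlier question:

SET SHOWPLAN_ALL ON;
GO
-- While SHOWPLAN_ALL is on, the statement is not executed; instead each
-- plan operator comes back as a row with EstimateIO, EstimateCPU and
-- TotalSubtreeCost columns showing the optimizer's cost breakdown.
select ChargeID, SUM(Fee) from Charges group by ChargeID;
GO
SET SHOWPLAN_ALL OFF;
GO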
It means how much it will "cost" you to run a specific SQL query, in terms of CPU, IO, etc.
For example, Query A can cost you 1.2 sec and Query B can cost you 1.8 sec.
See here:
Measuring Query Performance : "Execution Plan Query Cost" vs "Time Taken"
The cost of a query relates to how much CPU utilization and time the query will take. It is an estimated value; your query might take less or more time depending on the data. If your tables and all of their indexes are analyzed (analyze table table_name compute statistics for table for all indexes for all indexed columns), then the cost-based estimate should line up well with your actual query execution time.
Theoretically, the above answers may satisfy you, but here is some insight for when you are working on the floor.
Practically, you can assess the cost by the number of seeks and scans your SQL query performs.
Go to the execution plan, and from there you will be able to optimize the amount of time (roughly, the cost) your query takes.
A sample execution plan looks like this:
It means the query performance: how well optimized the query is.
I am trying to improve the performance of one of my queries.
My query is made up of 10 different selects.
The actual production query takes 36 sec to execute.
If I display the execution plan, one select has a query cost of 18%.
So I replaced an in clause (in this select) with an xml query (http://www.codeproject.com/KB/database/InClauseAndSQLServer.aspx).
The new query now takes 28 sec to execute, but SQL Server tells me that the above select has a query cost of 100%. This is the only change I made, and there is no parallelism in any query.
PRODUCTION:
36 sec, my select is 18% (the others are 10%).
NEW VERSION:
28 sec, my select is 100% (the others are 0%).
Do you have any idea how SQL Server computes this "query cost"? (I am starting to believe it is random or something like that.)
Query cost is a unitless measure of a combination of CPU cycles, memory, and disk IO.
Very often you will see operators or plans with a higher cost but faster execution time.
Primarily this is due to the difference in speed of the above three components. CPU and memory are fairly quick, and are also uncommon as bottlenecks. If you can shift some pressure from the disk IO subsystem to the CPU, the query may show a higher cost but should execute substantially faster.
If you want to get more detailed information about the execution of your specific queries, you can use:
SET STATISTICS IO ON
SET STATISTICS TIME ON
This will output detailed information about CPU cycles, plan creation, and page reads (both from disk and from memory) to the Messages tab.
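A typical usage pattern looks like the following sketch (the query is just a placeholder reusing the Charges example from earlier; substitute the statement you want to measure):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- The statement being measured (placeholder).
select ChargeID, SUM(Fee) from Charges group by ChargeID;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
-- The Messages tab now shows logical/physical reads per table,
-- plus CPU time and elapsed time for parse/compile and execution.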
When Oracle estimates the 'cost' of certain queries, does it actually look at the amount of data (rows) in a table?
For example:
If I'm doing a full table scan of employees for name='Bob', does it estimate the cost by counting the number of existing rows, or is it always a set cost?
The cost optimizer uses segment (table and index) statistics as well as system (CPU + I/O performance) statistics for its estimates. Although it depends on how your database is configured, from 10g onwards the segment statistics are usually computed once per day by a process that calls the DBMS_STATS package.
In the default configuration, Oracle will check the table statistics (which you can look at by querying the ALL_TABLES view - see the NUM_ROWS column). Normally an Oracle job runs periodically to re-gather these statistics by querying part or all of the table.
If the statistics haven't been gathered (yet), the optimizer will (depending on the optimizer_dynamic_sampling parameter) run a quick sample query on the table in order to calculate an estimate for the number of rows in that table.
(To be more accurate, the cost of scanning a table is calculated not from the number of rows but from the number of blocks in the table (which you can see in the BLOCKS column of ALL_TABLES). It takes this number and divides it by a factor related to the multi-block read count to calculate the cost of that part of the plan.)
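To see the statistics these estimates are based on, or to refresh them, here is a minimal Oracle sketch (the EMPLOYEES table name is taken from the question's example and is otherwise hypothetical):

-- Inspect the stored statistics the optimizer works from.
SELECT table_name, num_rows, blocks, last_analyzed
FROM   all_tables
WHERE  table_name = 'EMPLOYEES';

-- Re-gather statistics for the table via DBMS_STATS.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'EMPLOYEES');
END;
/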