SQL Server chooses the wrong execution plan

When the first query below is executed, SQL Server chooses the wrong execution plan. Why?
SELECT top 10 AccountNumber , AVERAGE
FROM [M].[dbo].[Account]
WHERE [Code] = 9201
Go
SELECT top 10 AccountNumber , AVERAGE
FROM [M].[dbo].[Account] with (index(IX_Account))
WHERE [Code] = 9201
SQL Server chooses the clustered PK index for the first query, and the elapsed time is 78254 ms. If I force SQL Server to use the non-clustered index, the elapsed time is 2 ms. The statistics on the Account table are up to date.

This usually comes down to bad statistics on the indexes involved. Even with correct stats, a histogram can only hold so many samples, and when the values are massively skewed the optimiser can conclude that the predicate will not match a sufficiently small number of rows for the non-clustered index to be worthwhile.
You can also have a massive number of [almost] empty pages to read through, with the matching values only at "the end". Between two otherwise close plan variations, one can then require drastically more IO to burn through the holes. Likewise, if there are not actually 10 rows with Code 9201, a plan that uses the PK/CI rather than a more fitting index has to scan the entire table before TOP 10 can stop. Sparse pages like this are more prevalent after plenty of deletes.
Try updating the stats on the various indexes and see whether the plan changes. 78 seconds is a lot of IO for a single table scan.
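As a rough sketch of the statistics refresh suggested above: the table, index and database names come from the question, while the choice of FULLSCAN and the assumption that IX_Account leads with the Code column are mine, not the asker's.

```sql
-- Sketch only: [M], dbo.Account and IX_Account come from the question.
-- FULLSCAN is an assumption about how thorough the refresh should be.
USE [M];
GO

-- Rebuild all statistics on the table from every row rather than a sample.
UPDATE STATISTICS dbo.Account WITH FULLSCAN;
GO

-- Show the histogram kept for the non-clustered index, so the step that
-- covers Code = 9201 can be compared with the real row count.
DBCC SHOW_STATISTICS ('dbo.Account', IX_Account) WITH HISTOGRAM;
GO
```

If the plan for the unhinted query still uses the clustered index afterwards, comparing estimated and actual row counts in the two plans would be the next thing to look at.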

Related

Despite the existence of relevant indices, PostgreSQL query is slow

I have a table with 3 columns:
time (timestamptz)
price (numeric(8,2))
set_id (int)
The table contains 7.4M records.
I've created a simple index on time and an index on set_id.
I want to run the following query:
select * from test_prices where time BETWEEN '2015-06-05 00:00:00+00' and '2020-06-05 00:00:00+00';
Despite my indices, the query takes 2 minutes and 30 seconds.
See the EXPLAIN ANALYZE output: https://explain.depesz.com/s/ZwSH
GCP postgres DB has the following stats:
What do I miss here? Why is this query so slow and how can I improve?

According to your explain plan, the query is returning 1.6 million rows out of 4.5 million. That means that a significant portion of the rows is being returned.
Postgres wisely decides that a full table scan is more efficient than using an index, because there is a good chance that all the data pages will need to be read anyway.
It is surprising that you are reporting 00:02:30 for the query. The explain is saying that the query completes in about 1.4 seconds -- which seems reasonable.
I suspect that the elapsed time is caused by the volume of data being returned (perhaps the rows are very wide), a slow network connection to the database, or contention on the database/server.

Your query selects two thirds of the table. A sequential scan is the most efficient way to process such a query.
Your query executes in under 2 seconds. It must be your client that takes a long time to render the query result (pgAdmin is infamous for that). Use a different client.
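One way to separate server time from client time, sketched in PostgreSQL syntax with the table and column names from the question. Nothing here is specific to the asker's setup beyond those names.

```sql
-- Sketch only: test_prices and time come from the question.
-- EXPLAIN ANALYZE runs the query but discards the result rows, so the
-- reported execution time excludes network transfer and client rendering.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM test_prices
WHERE time BETWEEN '2015-06-05 00:00:00+00' AND '2020-06-05 00:00:00+00';

-- Same filter, but only one row travels back to the client.
SELECT count(*)
FROM test_prices
WHERE time BETWEEN '2015-06-05 00:00:00+00' AND '2020-06-05 00:00:00+00';
```

If both statements finish in a second or two while SELECT * takes minutes in the client, the time is going into transferring or displaying the rows rather than into the scan.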

Simple select query running forever

A simple SQL SELECT query is running forever for a particular ID in SQL Server 2012.
This query is running forever; it should return 10000 rows:
select *
from employees
where company_id = 34
If I change the query to
select *
from employees
where company_id = 12
it returns 7000 rows very quickly.
Employees is a view created by joining different tables.
Could there be a problem in the view?

One possibility is that you have a very large table. Such a query is probably scanning the entire table and returning matching rows as they are encountered.
My guess is that rows for company 12 are encountered before rows for company 34.
If this is the case, then an index on (company_id) should help.
There may be other causes as well. Here are two other possibilities:
Contention for rows with company_id 34 that are causing delays on reading the data (this would depend on isolation level that you are using and the nature of concurrent updates).
An unlimited size column which is populated with very big values for company_id 34 and empty or very small for 12.
There may be other possibilities as well.

One thing you can do to speed this up is to create an index on company_id; a b-tree index lets the engine seek to the matching rows instead of scanning.
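A minimal sketch of that index in T-SQL. Because employees is a view, the index has to go on a base table; dbo.EmployeeBase and the index name are hypothetical stand-ins for whichever underlying table actually holds company_id.

```sql
-- Hypothetical names: dbo.EmployeeBase stands in for the base table
-- behind the employees view that contains company_id.
CREATE NONCLUSTERED INDEX IX_EmployeeBase_company_id
    ON dbo.EmployeeBase (company_id);
GO

-- Re-run the slow case and check the actual plan for an index seek.
SELECT *
FROM employees
WHERE company_id = 34;
GO
```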

Without looking at the structure of the table and the execution plan, here are a few things to try apart from what Gordon has already covered:
1. Could you create indexes on the underlying tables that cover this query? That means indexing the 'searched' and 'sorted' columns (joins, WHERE clause, ORDER BY, GROUP BY, DISTINCT) and putting the selected columns in the INCLUDE part of the index (for a nonclustered rowstore index). The aim is to see an 'index seek' in the execution plan.
2. Update statistics on the underlying tables. (As a side note, I would suggest keeping 'AUTO CREATE' and 'AUTO UPDATE' statistics ON unless you have a reason not to do that automatically in your application.)
3. I would also like to know when defragmentation was last performed on the server. Long-overdue defragmentation could well explain why you see this kind of issue for certain values, especially on a table with a lot of write operations.
4. Execute the query again. Even if you do not have proper information about #3 above, you can try executing the query after skipping step #3.
5. While the query is running, check the wait stats on the server by querying the DMVs sys.dm_os_wait_stats and sys.dm_tran_locks. Check whether the wait is due to CXPACKET (waits on parallel query threads), PAGEIOLATCH (reading pages from disk rather than from RAM) or locks. This is the starting point of the investigation; it will point you to the root cause so you can take appropriate measures.
6. An additional quick check: look at 'Available RAM' in the server's Task Manager. Make sure that SQL Server's RAM is not being used up by other unnecessary applications or sessions.
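The wait-stats check described above could look roughly like this, run from a second session while the slow query is executing. sys.dm_exec_requests is an addition that the answer does not mention; it is included here only because it shows the current wait per request.

```sql
-- Sketch only: run from a second session while the slow query executes.
-- Requires VIEW SERVER STATE permission.

-- What each other active request is waiting on, and who is blocking it.
SELECT session_id, status, command, wait_type, wait_time, blocking_session_id
FROM sys.dm_exec_requests
WHERE session_id <> @@SPID;

-- Cumulative waits since the instance started (or the counters were
-- last cleared), largest first.
SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;

-- Lock requests that have not been granted yet.
SELECT request_session_id, resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_status = 'WAIT';
```

A wait_type starting with PAGEIOLATCH points at disk reads, CXPACKET at parallelism, and a non-zero blocking_session_id at lock contention.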

Executing same simple select statement or stored procedure on SQL Azure takes long time or times-out

I have two Azure SQL instances, both Standard S2 (50 DTUs). When I run simple SELECT statements on the two instances, one of them takes much longer than the other or times out. The slower instance has more records in its tables.
Both instances have the same table schema. In the slower instance, the LogEvidence table has 1324928 records and the LogItem table has 649391. In the faster instance, LogEvidence has 89504 records and LogItem has 89496.
Below is the simple select statement
select count(*) from logitem
This statement takes 0s on the faster instance and 138s on the slower instance. Any stored procedure I execute also takes longer on the slower instance or times out.
The time taken by both instances should be almost the same.

Those simple queries perform big scans on the table and read all rows. If the table has a clustered index, you don't have to run SELECT COUNT(*) to find out how many records it has. The following query should do that faster:
SELECT OBJECT_NAME(ps.object_id) , i.name , row_count
FROM sys.dm_db_partition_stats AS ps INNER JOIN sys.indexes AS i
ON ps.index_id = i.index_id AND ps.object_id = i.object_id
WHERE i.name like '%logitem%'
If the table does not have an Id column, please add an auto-incrementing id to the table and make it the clustered index.
You can also try adding a redundant WHERE clause like the one below to the query; you may get better performance.
SELECT count(*)
FROM logitem
WHERE id > 0
Here Id is the auto-incrementing column.

I have some experience with Azure, and from your description I think there are a few things you can check:
Since you are only using COUNT, indexes play no role. I understand that the other answer suggests WHERE id > 0, but Azure should be able to count 1M rows without hitting the 30-second timeout. For other queries, though, you need indexes, or Azure will fail.
Check whether your server is under maintenance. The chance is low, but it does happen to us: we are on S4 and occasionally our server just gets slow, then works fine again after 10-30 minutes. Maybe the underlying hardware is busy with some process that slows it down.
This is the most important reason for slow execution, especially if a lot of writes and deletes happen on your server: check the database size. Azure databases get fragmented quickly; we have to defragment ours every 10 days. If your bacpac is 100MB and the database size in Azure shows as 5-6 GB, it definitely needs optimization because a lot of fragmentation has built up. MSDN has published some queries to rebuild indexes and remove fragmentation; I don't remember the URL, but a simple Google search will find them. That should speed things up.
Azure has a feature that automatically creates indexes. Check whether both tables have the same indexes; maybe your faster instance has an index that Azure created by itself.
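A sketch of how the fragmentation point could be checked and addressed in T-SQL. This is not the MSDN script the answer alludes to; dbo.LogItem assumes the table from the question lives in the dbo schema.

```sql
-- Sketch only: dbo.LogItem assumes the question's table is in dbo.
-- 'LIMITED' is the cheapest scan mode for this function.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.LogItem'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id;
GO

-- Rebuild every index on the table to remove the fragmentation.
ALTER INDEX ALL ON dbo.LogItem REBUILD;
GO
```

Comparing page_count between the two instances also shows how much more data the slower one has to read for a full scan.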

You should step back and ponder your assumption:
1. "performance should be about the same" - you have more data in one case vs. the other. In the limit, you should expect the performance of the second one to potentially be somewhat slower than the original one.
Now, let's go into the "why" it can be slower and how you can investigate each case:
Step 1: Look at the query plans for each case and see what you have. Likely, you will have something like:
StreamAgg <- Clustered Index Scan
(if you have other b-tree indexes, you might scan one of them and it might be faster since the index would not be as wide and thus the index will have fewer pages to scan)
Step 2: You can look at the actual execution times and resource use for each query to see why they are different. One way to do this is to run "set statistics time on", then "set statistics io on", then run your query. It will dump extra information into SSMS when you run the query from there. (You can read about this here: https://learn.microsoft.com/en-us/sql/t-sql/statements/set-statistics-io-transact-sql?view=sql-server-2017)
If you review the output from each one, you may find reasons for why the performance is different. One possible explanation is that the amount of memory is limited in an S2 and you are just at the boundary for where all the pages fit in memory vs. not for the two examples. In that case, doing a count(*) query would need to cycle through all the pages and do much more IO than in the smaller case where they might all be in memory already.
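Step 2 can be sketched as follows, using the query from the question. Run it on both instances from SSMS and compare the Messages output.

```sql
-- Turn on per-statement timing and IO reporting for this session.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
GO

SELECT COUNT(*) FROM logitem;
GO

-- Switch the reporting off again.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

In the IO output, a high "physical reads" or "read-ahead reads" figure on the slower instance, against mostly "logical reads" on the faster one, would be consistent with the pages not fitting in memory.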
Step 3: You can also potentially examine the query store to get insight into why one case is fast and one case is not. An overview of how to use it is here:
https://learn.microsoft.com/en-us/sql/relational-databases/performance/monitoring-performance-by-using-the-query-store?view=sql-server-2017
Note: it is on-by-default in SQL Azure so you can just go look at the time window when you ran the queries to get insight into what was happening at that time in your database.
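As a sketch of Step 3, the Query Store catalog views can be queried directly instead of going through the SSMS reports. The LIKE filter on the query text is just one assumed way to find the statement from the question.

```sql
-- Sketch only: lists stored runtime statistics for statements
-- that mention logitem, slowest average duration first.
SELECT TOP (20)
       qt.query_sql_text,
       p.plan_id,
       rs.count_executions,
       rs.avg_duration,            -- reported in microseconds
       rs.avg_logical_io_reads,
       rs.last_execution_time
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
  ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p
  ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
  ON rs.plan_id = p.plan_id
WHERE qt.query_sql_text LIKE N'%logitem%'
ORDER BY rs.avg_duration DESC;
```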
Finally, you might consider ways to make the query go faster if you need it to be faster.
* creating a narrow b-tree index on the table may help for that one query (count(*) doesn't return any columns and just needs a count of rows from some non-filtered index).
* you could use a Columnstore (which requires an S3 or above for memory reasons). This kind of column-oriented index is optimized for this kind of query and would be much faster as the size of the table increases in the future.
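The two options above might be sketched like this. The dbo schema, the Id column (borrowed from the other answer) and both index names are assumptions, and only one of the two indexes would normally be needed.

```sql
-- Option 1 (assumed names): a narrow b-tree index that COUNT(*)
-- can scan instead of the wider clustered index.
CREATE NONCLUSTERED INDEX IX_LogItem_Id
    ON dbo.LogItem (Id);
GO

-- Option 2 (assumed names): a nonclustered columnstore index,
-- which per the answer needs S3 or above.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_LogItem_Id
    ON dbo.LogItem (Id);
GO
```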
Hope that helps.

SQL Server 2005 : intermittently slow Inserts

I have a client app that is submitting the following command to SQL Server 2005. At a specific time of day we are having performance issues where some of the requests are taking between 2 - 8 seconds to run when the norm is below 300ms. We are researching SQL Server options as well as all external variables that can impact the server.
My question here is how/why can a request take 8 seconds and during this time many other identical requests start and finish during this 8 second window? What can be preventing the 8 second call from finishing, but not prevent or slow down the other calls?
Running SQL Server Profiler during this time, the number of reads is around 20 and the number of writes is less than 5 for all calls (long and short durations).
The table being inserted into has around 22M records. We are keeping about 30 days worth of data. We will probably change the approach to archive this data daily and keep the daily insert table small and index free, but really want to understand what is happening here.
There are no triggers on this table.
There are 3 indexes for GUID, Time and WebServerName (none are clustered)
Here's the command being submitted:
exec sp_executesql N'Insert Into WebSvcLog_Duration (guid,time,webservername,systemidentity,useridentity,metricname,details,duration,eventtype)values(@guid,@time,@webservername,@systemidentity,@useridentity,@metricname,@details,@duration,@eventtype)',N'@guid nvarchar(36),@time datetime,@webservername nvarchar(10),@systemidentity nvarchar(10),@useridentity nvarchar(8),@metricname nvarchar(5),@details nvarchar(101),@duration float,@eventtype int',@guid=N'...',@time='...',@webservername=N'...',@systemidentity=N'...',@useridentity=N'...',@metricname=N'...',@details=N'...',@duration=0.0,@eventtype=1

The probable cause is heap fragmentation. You didn't mention any index maintenance, so I'm assuming there is none. The best way to minimize fragmentation is to build a clustered index on a monotonic value (a column with a naturally increasing order). I'm not sure what the time column is supposed to represent, but if it's the time of insertion, then it might be a good candidate for a clustered index; if not, I'd add a column that captures the time the row was inserted and build a clustered index on that.
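A minimal sketch of that clustered index, assuming the time column really is the insertion time and the table lives in dbo. The index name is hypothetical.

```sql
-- Sketch only: assumes [time] holds the insertion time and the table
-- is in the dbo schema; the index name is hypothetical.
CREATE CLUSTERED INDEX CIX_WebSvcLog_Duration_time
    ON dbo.WebSvcLog_Duration ([time]);
GO
```

On a 22M-row heap this rewrites the table and rebuilds the three existing non-clustered indexes, so it is best done outside the busy period.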

Count(*) vs Count(1) - SQL Server

Just wondering if any of you people use Count(1) over Count(*) and if there is a noticeable difference in performance or if this is just a legacy habit that has been brought forward from days gone past?
The specific database is SQL Server 2005.

There is no difference.
Reason:
Books on-line says "COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )"
"1" is a non-null expression: so it's the same as COUNT(*).
The optimizer recognizes it for what it is: trivial.
The same as EXISTS (SELECT * ... or EXISTS (SELECT 1 ...
Example:
SELECT COUNT(1) FROM dbo.tab800krows
SELECT COUNT(1),FKID FROM dbo.tab800krows GROUP BY FKID
SELECT COUNT(*) FROM dbo.tab800krows
SELECT COUNT(*),FKID FROM dbo.tab800krows GROUP BY FKID
Same IO, same plan, the works
Edit, Aug 2011
Similar question on DBA.SE.
Edit, Dec 2011
COUNT(*) is mentioned specifically in ANSI-92 (look for "Scalar expressions 125")
Case:
a) If COUNT(*) is specified, then the result is the cardinality of T.
That is, the ANSI standard recognizes it as bleeding obvious what you mean. COUNT(1) has been optimized out by RDBMS vendors because of this superstition. Otherwise it would be evaluated as per ANSI
b) Otherwise, let TX be the single-column table that is the
result of applying the <value expression> to each row of T
and eliminating null values. If one or more null values are
eliminated, then a completion condition is raised: warning-

In SQL Server, these statements yield the same plans.
Contrary to the popular opinion, in Oracle they do too.
SYS_GUID() in Oracle is quite computation intensive function.
In my test database, t_even is a table with 1,000,000 rows
This query:
SELECT COUNT(SYS_GUID())
FROM t_even
runs for 48 seconds, since the function needs to evaluate each SYS_GUID() returned to make sure it's not a NULL.
However, this query:
SELECT COUNT(*)
FROM (
SELECT SYS_GUID()
FROM t_even
)
runs for but 2 seconds, since it doesn't even try to evaluate SYS_GUID() (despite * being the argument to COUNT(*))

I work on the SQL Server team and I can hopefully clarify a few points in this thread (I had not seen it previously, so I am sorry the engineering team has not done so previously).
First, there is no semantic difference between select count(1) from table vs. select count(*) from table. They return the same results in all cases (and it is a bug if not). As noted in the other answers, select count(column) from table is semantically different and does not always return the same results as count(*).
Second, with respect to performance, there are two aspects that would matter in SQL Server (and SQL Azure): compilation-time work and execution-time work. The Compilation time work is a trivially small amount of extra work in the current implementation. There is an expansion of the * to all columns in some cases followed by a reduction back to 1 column being output due to how some of the internal operations work in binding and optimization. I doubt it would show up in any measurable test, and it would likely get lost in the noise of all the other things that happen under the covers (such as auto-stats, xevent sessions, query store overhead, triggers, etc.). It is maybe a few thousand extra CPU instructions. So, count(1) does a tiny bit less work during compilation (which will usually happen once and the plan is cached across multiple subsequent executions). For execution time, assuming the plans are the same there should be no measurable difference. (One of the earlier examples shows a difference - it is most likely due to other factors on the machine if the plan is the same).
As to how the plan can potentially be different. These are extremely unlikely to happen, but it is potentially possible in the architecture of the current optimizer. SQL Server's optimizer works as a search program (think: computer program playing chess searching through various alternatives for different parts of the query and costing out the alternatives to find the cheapest plan in reasonable time). This search has a few limits on how it operates to keep query compilation finishing in reasonable time. For queries beyond the most trivial, there are phases of the search and they deal with tranches of queries based on how costly the optimizer thinks the query is to potentially execute. There are 3 main search phases, and each phase can run more aggressive(expensive) heuristics trying to find a cheaper plan than any prior solution. Ultimately, there is a decision process at the end of each phase that tries to determine whether it should return the plan it found so far or should it keep searching. This process uses the total time taken so far vs. the estimated cost of the best plan found so far. So, on different machines with different speeds of CPUs it is possible (albeit rare) to get different plans due to timing out in an earlier phase with a plan vs. continuing into the next search phase. There are also a few similar scenarios related to timing out of the last phase and potentially running out of memory on very, very expensive queries that consume all the memory on the machine (not usually a problem on 64-bit but it was a larger concern back on 32-bit servers). Ultimately, if you get a different plan the performance at runtime would differ. I don't think it is remotely likely that the difference in compilation time would EVER lead to any of these conditions happening.
Net-net: Please use whichever of the two you want as none of this matters in any practical form. (There are far, far larger factors that impact performance in SQL beyond this topic, honestly).
I hope this helps. I did write a book chapter about how the optimizer works, but I don't know if it's appropriate to post it here (I believe I still get tiny royalties from it). So, instead of posting that, I'll post a link to a talk I gave at SQLBits in the UK about how the optimizer works at a high level, so you can see the different main phases of the search in a bit more detail if you want to learn about that. Here's the video link: https://sqlbits.com/Sessions/Event6/inside_the_sql_server_query_optimizer

Clearly, COUNT(*) and COUNT(1) will always return the same result. Therefore, if one were slower than the other it would effectively be due to an optimiser bug. Since both forms are used very frequently in queries, it would make no sense for a DBMS to allow such a bug to remain unfixed. Hence you will find that the performance of both forms is (probably) identical in all major SQL DBMSs.

In the SQL-92 Standard, COUNT(*) specifically means "the cardinality of the table expression" (could be a base table, VIEW, derived table, CTE, etc).
I guess the idea was that COUNT(*) is easy to parse. Using any other expression requires the parser to ensure it doesn't reference any columns (COUNT('a') where a is a literal and COUNT(a) where a is a column can yield different results).
In the same vein, COUNT(*) can be easily picked out by a human coder familiar with the SQL Standards, a useful skill when working with more than one vendor's SQL offering.
Also, in the special case SELECT COUNT(*) FROM MyPersistedTable;, the thinking is the DBMS is likely to hold statistics for the cardinality of the table.
Therefore, because COUNT(1) and COUNT(*) are semantically equivalent, I use COUNT(*).

COUNT(*) and COUNT(1) are the same in terms of both result and performance.

I would expect the optimiser to ensure there is no real difference outside weird edge cases.
As with anything, the only real way to tell is to measure your specific cases.
That said, I've always used COUNT(*).

As this question comes up again and again, here is one more answer. I hope to add something for beginners wondering about "best practice" here.
SELECT COUNT(*) FROM something counts records which is an easy task.
SELECT COUNT(1) FROM something retrieves a 1 per record and then counts the 1s that are not null, which is essentially counting records, only more complicated.
Having said this: a good DBMS notices that the second statement will result in the same count as the first statement and reinterprets it accordingly, so as not to do unnecessary work. So usually both statements will result in the same execution plan and take the same amount of time.
However, from the point of view of readability you should use the first statement. You want to count records, so count records, not expressions. Use COUNT(expression) only when you want to count non-null occurrences of something.

I ran a quick test on SQL Server 2012 on an 8 GB RAM hyper-v box. You can see the results for yourself. I was not running any other windowed application apart from SQL Server Management Studio while running these tests.
My table schema:
CREATE TABLE [dbo].[employee](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_employee] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Total number of records in Employee table: 178090131 (~ 178 million rows)
First Query:
Set Statistics Time On
Go
Select Count(*) From Employee
Go
Set Statistics Time Off
Go
Result of First Query:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 35 ms.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 10766 ms, elapsed time = 70265 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
Second Query:
Set Statistics Time On
Go
Select Count(1) From Employee
Go
Set Statistics Time Off
Go
Result of Second Query:
SQL Server parse and compile time:
CPU time = 14 ms, elapsed time = 14 ms.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 11031 ms, elapsed time = 70182 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
You can see a difference of 83 (= 70265 - 70182) milliseconds, which can easily be attributed to the exact system conditions at the time the queries ran. Also, I did a single run, so the comparison would be more accurate with several runs and some averaging. If the difference for such a huge data set is less than 100 milliseconds, we can reasonably conclude that the SQL Server engine shows no performance difference between the two queries.
Note : RAM hits close to 100% usage in both the runs. I restarted SQL Server service before starting both the runs.

SET STATISTICS TIME ON
select count(1) from MyTable (nolock) -- table containing 1 million records.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 36 ms.
select count(*) from MyTable (nolock) -- table containing 1 million records.
SQL Server Execution Times:
CPU time = 46 ms, elapsed time = 37 ms.
I've run this hundreds of times, clearing the cache every time. The results vary as server load varies, but count(*) almost always has higher CPU time.

There is an article showing, with proof, that COUNT(1) on Oracle is just an alias for COUNT(*).
I will quote some parts:
There is a part of the database software that is called “The
Optimizer”, which is defined in the official documentation as
“Built-in database software that determines the most efficient way to
execute a SQL statement“.
One of the components of the optimizer is called “the transformer”,
whose role is to determine whether it is advantageous to rewrite the
original SQL statement into a semantically equivalent SQL statement
that could be more efficient.
Would you like to see what the optimizer does when you write a query
using COUNT(1)?
With a user that has the ALTER SESSION privilege, you can set a tracefile_identifier, enable optimizer tracing and run the COUNT(1) select, for example: SELECT /* test-1 */ COUNT(1) FROM employees;.
After that, you need to locate the trace files, which can be done with SELECT VALUE FROM V$DIAG_INFO WHERE NAME = 'Diag Trace';. Later in the file, you will find:
SELECT COUNT(*) "COUNT(1)" FROM "COURSE"."EMPLOYEES" "EMPLOYEES"
As you can see, it's just an alias for COUNT(*).
Another important comment: the COUNT(*) was really faster two decades ago on Oracle, before Oracle 7.3:
Count(1) has been rewritten in count(*) since 7.3 because Oracle like
to Auto-tune mythic statements. In earlier Oracle7, oracle had to
evaluate (1) for each row, as a function, before DETERMINISTIC and
NON-DETERMINISTIC exist.
So two decades ago, count(*) was faster
For other databases, such as SQL Server, this should be researched individually.
I know that this question is specific to SQL Server, but the other questions on SO about the same subject (without mentioning a specific database) were closed and marked as duplicates of this one.

In all RDBMS, the two ways of counting are equivalent in terms of what result they produce. Regarding performance, I have not observed any performance difference in SQL Server, but it may be worth pointing out that some RDBMS, e.g. PostgreSQL 11, have less optimal implementations for COUNT(1) as they check for the argument expression's nullability as can be seen in this post.
I've found a 10% performance difference for 1M rows when running:
-- Faster
SELECT COUNT(*) FROM t;
-- 10% slower
SELECT COUNT(1) FROM t;

COUNT(1) is not substantially different from COUNT(*), if at all. As to the question of COUNTing NULLable COLUMNs, this can be straightforward to demo the differences between COUNT(*) and COUNT(<some col>)--
USE tempdb;
GO
IF OBJECT_ID( N'dbo.Blitzen', N'U') IS NOT NULL DROP TABLE dbo.Blitzen;
GO
CREATE TABLE dbo.Blitzen (ID INT NULL, Somelala CHAR(1) NULL);
INSERT dbo.Blitzen SELECT 1, 'A';
INSERT dbo.Blitzen SELECT NULL, NULL;
INSERT dbo.Blitzen SELECT NULL, 'A';
INSERT dbo.Blitzen SELECT 1, NULL;
SELECT COUNT(*), COUNT(1), COUNT(ID), COUNT(Somelala) FROM dbo.Blitzen;
GO
DROP TABLE dbo.Blitzen;
GO
The SELECT returns 4, 4, 2, 2: COUNT(*) and COUNT(1) count every row, while COUNT(ID) and COUNT(Somelala) skip the rows where that column is NULL.
