SQL Server 2012 query blocked with LCK_M_IS

I'm struggling to understand how the following two queries could be blocking each other.
Running query (could be almost anything though):
insert bulk [Import].[WorkTable] ...
I'm trying to run the following SELECT query at the same time:
SELECT *
FROM ( SELECT *
       FROM #indexPart ip
            JOIN sys.indexes i (NOLOCK)
              ON i.object_id = ip.ObjectId
             AND i.name = ip.IndexName ) i
CROSS APPLY sys.dm_db_index_physical_stats(DB_ID(), i.object_id, i.index_id, NULL, 'LIMITED') ps
WHERE i.is_disabled = 0
The second query is blocked by the first and shows LCK_M_IS as its wait type. An important piece of information: the temporary table #indexPart contains a single record, for an index on a completely different table. My expectation is that the CROSS APPLY runs the stats function only against that one index, which has nothing to do with the table the other query is writing to.
Thanks
EDIT (NEW):
After several more tests I think I found the culprit, but again I can't explain it.
The bulk insert session has an X lock on table [Import].[WorkTable].
The query above is checking for an index on table [Import].[AnyOtherTable] BUT is requesting an IS lock on [Import].[WorkTable]. I've verified again and again that the query above (when run without the CROSS APPLY) returns only an index on table [Import].[AnyOtherTable].
Now here comes the magic: changing the CROSS APPLY to an OUTER APPLY runs through just fine, without any locking issues.
I hope someone can explain this to me ...

The problem could be the placement of the WHERE clause; it should be inside the derived table. The following change could make a difference:
FROM ( SELECT *
       FROM #indexPart ip
            JOIN sys.indexes i (NOLOCK)
              ON i.object_id = ip.ObjectId
             AND i.name = ip.IndexName
       WHERE i.is_disabled = 0 ) i
Written this way, it may reduce the number of rows passed on to the CROSS APPLY.
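An alternative sketch (my own suggestion, not part of the original answer): stage the filtered index list into a temp table first, so the APPLY runs against a plain table rather than against a join the optimizer may evaluate in an unexpected order. Whether this avoids the IS lock on the unrelated table is something to verify on your system.
-- Stage the qualifying indexes first, then APPLY the DMF to the staged rows.
SELECT i.object_id, i.index_id
INTO #filteredIndexes
FROM #indexPart ip
     JOIN sys.indexes i (NOLOCK)
       ON i.object_id = ip.ObjectId
      AND i.name = ip.IndexName
WHERE i.is_disabled = 0;

SELECT ps.*
FROM #filteredIndexes fi
CROSS APPLY sys.dm_db_index_physical_stats(DB_ID(), fi.object_id, fi.index_id, NULL, 'LIMITED') ps;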

Related

Oracle Select Query Takes Too Much When Not In Clause Used

ORACLE VERSION : 19C
I am working on a legacy select query which returns around 60k rows. It is formed of 9 joins and 2 unions. I want to exclude a small set of records if they fall inside the case I specified.
I wrote a select query using four joins and then used a NOT IN clause to exclude those records.
The query was executing in around 15 seconds before, but after I added the NOT IN clause it did not finish even in 20 minutes, and I aborted it.
It is coded like this:
A.ID NOT IN (SELECT A.ID
             FROM A
                  INNER JOIN B ON A.X = B.X
                  INNER JOIN C ON B.Y = C.Y
                  INNER JOIN D ON C.Z = D.Z)
However, if I execute the subquery beforehand, insert its results into a table, and then use the NOT IN clause against that table, it finishes in about 15 seconds, as before.
It is coded like this:
A.ID NOT IN (SELECT GT.ID FROM GENERATED_TABLE GT)
Do you know why it takes so much time when the result is not materialized into a table?
And is there any way to make the first version run faster?
I am expecting it to take much less time.
Try using EXISTS instead of IN: an EXISTS clause is often much faster than IN when the subquery result set is very large.
And again, check the EXPLAIN PLAN and search for FULL SCAN operations; that will likely be the main cause.
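A sketch of the rewrite, using the aliases from the question (B.X is assumed from context, and I assume A.ID is the primary key so the inner reference to A can be correlated away). Unlike NOT IN, NOT EXISTS is not poisoned by NULLs in the subquery result, and the optimizer can usually turn it into an anti-join:
-- Replaces the A.ID NOT IN (...) predicate.
NOT EXISTS (SELECT 1
            FROM B
                 INNER JOIN C ON B.Y = C.Y
                 INNER JOIN D ON C.Z = D.Z
            WHERE B.X = A.X)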

SQL Server 2012 Performance issue using FULLTEXT

I'm using SQL Server 2012 Standard and I have some issue using the CONTAINS clause on a query.
My query:
SELECT *
FROM calles AS c
     INNER JOIN colonias AS col ON c.ID_Colonia = col.ID_Colonia
WHERE CONTAINS(c.Nombre, @Busqueda) OR CONTAINS(col.Nombre, @Busqueda)
If I use only one CONTAINS, the search takes about 200 ms, but if I use both it takes about 10 s (that's a lot of time). I tried a workaround using UNION, like this:
SELECT *
FROM calles AS c
     INNER JOIN colonias AS col ON c.ID_Colonia = col.ID_Colonia
WHERE CONTAINS(c.Nombre, @Busqueda)
UNION
SELECT *
FROM calles AS c
     INNER JOIN colonias AS col ON c.ID_Colonia = col.ID_Colonia
WHERE CONTAINS(col.Nombre, @Busqueda)
And the query time is about 200 ms again. But I think the second version is clumsy. Am I making some mistake?
The FULLTEXT index in SQL Server is a service which is (kind of) external to the RDBMS engine.
It accepts the search string and returns a list of key values from the table (which then need to be joined back to the table itself to be sure they're still there).
So in fact you are joining two more tables in your query and applying an OR condition to the result of the join.
SQL Server's optimizer is not especially smart when it comes to constructs like this.
Replacing an OR condition with a UNION is a legitimate and commonly used optimization technique.
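To make the "list of key values joined back to the table" explicit, you can also use CONTAINSTABLE, which returns the matching keys ([KEY]) and a relevance score ([RANK]) as a rowset you join yourself. A sketch against the tables from the question (ID_Calle is an assumed full-text key column for calles):
SELECT c.*, col.*
FROM calles AS c
     INNER JOIN colonias AS col ON c.ID_Colonia = col.ID_Colonia
     INNER JOIN CONTAINSTABLE(calles, Nombre, @Busqueda) AS ftc
             ON ftc.[KEY] = c.ID_Calle  -- assumed full-text key column
UNION
SELECT c.*, col.*
FROM calles AS c
     INNER JOIN colonias AS col ON c.ID_Colonia = col.ID_Colonia
     INNER JOIN CONTAINSTABLE(colonias, Nombre, @Busqueda) AS ftcol
             ON ftcol.[KEY] = col.ID_Colonia;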

SQL Server stored procedure takes 1' 18" to run... seems long

Sure could use some optimization help here. I've got a stored procedure which takes approximately 1 minute, 18 seconds to run and it gets even worse when I run the asp.net page which hits it.
Some stats:
tbl_Allocation typically has approximately 55K records
CS_Ready has ~300
Redate_Orders has ~2000
Here is the code:
ALTER PROCEDURE [dbo].[sp_Order_Display]
/*
(
#parameter1 int = 5,
#parameter2 datatype OUTPUT
)
*/
AS
/* SET NOCOUNT ON */
BEGIN
WITH CS_Ready AS
(
SELECT
tbl_Order_Notes.txt_Order_Only As CS_Ready_Order
FROM
tbl_Order_Notes
INNER JOIN
tbl_Order_Notes_by_line ON tbl_Order_Notes.txt_Order_Only = SUBSTRING(tbl_Order_Notes_by_line.txt_Order_Key_by_line, 1, CHARINDEX('-', tbl_Order_Notes_by_line.txt_Order_Key_by_line, 0) - 1)
WHERE
(tbl_Order_Notes.bin_Customer_Service_Review = 'True')
AND (tbl_Order_Notes_by_line.dat_Recommended_Date_by_line IS NOT NULL)
AND (tbl_Order_Notes_by_line.bin_Redate_Request_by_line = 'True')
OR (tbl_Order_Notes.bin_Customer_Service_Review = 'True')
AND (tbl_Order_Notes_by_line.dat_Recommended_Date_by_line IS NULL)
AND (tbl_Order_Notes_by_line.bin_Redate_Request_by_line = 'False'
OR tbl_Order_Notes_by_line.bin_Redate_Request_by_line IS NULL)
),
Redate_Orders AS
(
SELECT DISTINCT
SUBSTRING(txt_Order_Key_by_line, 1, CHARINDEX('-', txt_Order_Key_by_line, 0) - 1) AS Redate_Order_Number
FROM
tbl_Order_Notes_by_line
WHERE
(bin_Redate_Request_by_line = 'True')
)
SELECT DISTINCT
tbl_Allocation.*, tbl_Order_Notes.*,
tbl_Order_Notes_by_line.*,
tbl_Max_Promised_Date_1.Max_Promised_Ship,
tbl_Max_Promised_Date_1.Max_Scheduled_Pick,
Redate_Orders.Redate_Order_Number, CS_Ready.CS_Ready_Order,
tbl_Most_Recent_Comments.Abbr_Comment,
MRC_Line.Abbr_Comment as Abbr_Comment_Line
FROM
tbl_Allocation
INNER JOIN
tbl_Max_Promised_Date AS tbl_Max_Promised_Date_1 ON tbl_Allocation.num_Order_Num = tbl_Max_Promised_Date_1.num_Order_Num
LEFT OUTER JOIN
CS_Ready ON tbl_Allocation.num_Order_Num = CS_Ready.CS_Ready_Order
LEFT OUTER JOIN
Redate_Orders ON tbl_Allocation.num_Order_Num = Redate_Orders.Redate_Order_Number
LEFT OUTER JOIN
tbl_Order_Notes ON Hidden_Order_Only = tbl_Order_Notes.txt_Order_Only
LEFT OUTER JOIN
tbl_Order_Notes_by_line ON Hidden_Order_Key = tbl_Order_Notes_by_line.txt_Order_Key_by_line
LEFT OUTER JOIN
tbl_Most_Recent_Comments ON Cast(tbl_Allocation.Hidden_Order_Only as varchar) = tbl_Most_Recent_Comments.Com_ID_Parent_Key
LEFT OUTER JOIN
tbl_Most_Recent_Comments as MRC_Line ON Cast(tbl_Allocation.Hidden_Order_Key as varchar) = MRC_Line.Com_ID_Parent_Key
ORDER BY
num_Order_Num, num_Line_Num
End
RETURN
What suggestions do you have to make this execute within five seconds or less?
Thanks,
Rob
Assuming you have appropriate indices defined, you still have several things that suggest problems.
1) You have two SELECT DISTINCT clauses in this query -- in a good design, DISTINCT clauses are rarely needed
2) The first inner join uses
tbl_Order_Notes_by_line
ON tbl_Order_Notes.txt_Order_Only
= SUBSTRING(tbl_Order_Notes_by_line.txt_Order_Key_by_line, 1,
CHARINDEX('-', tbl_Order_Notes_by_line.txt_Order_Key_by_line, 0) - 1)
This looks like a horrible join criterion -- function calls during the join prevent any decent query optimization. My guess is that you are using data that has internal meaning and that you are parsing that meaning out during the join, e.g.,
PartNumber = AAA-BBBB_NNNNNNNN
where AAA is the Country product line and BBBB is the year & month of the design
If you must have coded fields like these AND you need to manipulate them, put the codes into separate database fields and create a computed column -- or even a plain copy of the full part number field if the combined field is unusually complex.
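A sketch of that idea for the key used in the join above (column and index names are mine; it assumes every txt_Order_Key_by_line contains a '-'):
-- Parse the order number once into a persisted computed column, then
-- index it so the join no longer calls SUBSTRING/CHARINDEX per row.
ALTER TABLE tbl_Order_Notes_by_line
    ADD txt_Order_Only_by_line AS
        SUBSTRING(txt_Order_Key_by_line, 1,
                  CHARINDEX('-', txt_Order_Key_by_line) - 1) PERSISTED;

CREATE INDEX IX_Order_Notes_by_line_Order
    ON tbl_Order_Notes_by_line (txt_Order_Only_by_line);
The join then becomes a plain, index-friendly equality: tbl_Order_Notes.txt_Order_Only = tbl_Order_Notes_by_line.txt_Order_Only_by_line.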
This point is not a performance issue, but you have a long sub-query using multiple AND and OR clauses. I know the rules for operator precedence, you may know the rules for operator precedence, but will the next guy? Will you remember them at 1:00 AM when stuff is broken? Parenthesize explicitly, as illustrated below.
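For illustration, the CS_Ready predicate from the procedure with the grouping made explicit (my reading, relying on AND binding tighter than OR; verify against the intended logic):
WHERE tbl_Order_Notes.bin_Customer_Service_Review = 'True'
  AND (
        (tbl_Order_Notes_by_line.dat_Recommended_Date_by_line IS NOT NULL
         AND tbl_Order_Notes_by_line.bin_Redate_Request_by_line = 'True')
        OR
        (tbl_Order_Notes_by_line.dat_Recommended_Date_by_line IS NULL
         AND (tbl_Order_Notes_by_line.bin_Redate_Request_by_line = 'False'
              OR tbl_Order_Notes_by_line.bin_Redate_Request_by_line IS NULL))
      )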
ADDED
You are using two common table expressions. I know others say it does not happen, but I don't really trust the query optimizer with CTEs -- I have had to recode CTE-based joins for performance reasons on several occasions; creating an actual view equivalent to the CTE and using that instead can be a significant speedup. It may well depend on the version of SQL Server, but if you are running an older version I would definitely wonder about CTE optimization. -- This is not as important as the first two things I've mentioned; try to fix those first.
ADDED
I'm going to be harsh on CTEs again, as I did not really explain why they can be bad for performance, and it was bothering me. If you don't have performance issues and you like the syntax, they can be useful in at least limited usage; personally I don't normally recommend them for anything more than that -- and given that they are largely syntactic sugar, I really can't recommend them much at all.
I think the primary reason that CTEs don't get optimized well is that there are no statistics for the optimizer to use. If you are pulling a lot of rows into a CTE, you are probably better off creating a #temp table and populating it. You can even add an index or two to your #temp table, and the optimizer can figure out how to use them too. A table variable (@table) is similar, but at least through SQL 2012 they were no faster than #temp tables as far as I could tell -- supposedly new goodness in SQL Server 2014 helps with this.
A CTE is really just a temporary view in disguise, which is why I suggested you can replace it with a real view to get better performance (and you often can), or you can populate a temp table and sometimes get even better performance; a sketch of the latter follows.
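As a sketch, here is the Redate_Orders CTE from the procedure materialized into an indexed temp table (the index name is mine):
-- Materialize the CTE so the optimizer gets real statistics and an index.
SELECT DISTINCT
       SUBSTRING(txt_Order_Key_by_line, 1,
                 CHARINDEX('-', txt_Order_Key_by_line, 0) - 1) AS Redate_Order_Number
INTO #Redate_Orders
FROM tbl_Order_Notes_by_line
WHERE bin_Redate_Request_by_line = 'True';

CREATE INDEX IX_Redate ON #Redate_Orders (Redate_Order_Number);

-- The main SELECT then joins to #Redate_Orders instead of the CTE.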

SQL Server Count is slow

Counting tables with a large amount of data may be very slow, sometimes taking minutes; it can also cause deadlocks on a busy server. I want to display real values, so NOLOCK is not an option.
The servers I use are SQL Server 2005 or 2008, Standard or Enterprise -- if it matters.
I can imagine that SQL Server maintains the counts for every table, and if there is no WHERE clause I should be able to get that number pretty quickly, right?
For example:
SELECT COUNT(*) FROM myTable
should return immediately with the correct value. Do I need to rely on statistics being updated?
A very close approximation (ignoring any in-flight transactions) would be:
SELECT SUM(p.rows) FROM sys.partitions AS p
INNER JOIN sys.tables AS t
ON p.[object_id] = t.[object_id]
INNER JOIN sys.schemas AS s
ON s.[schema_id] = t.[schema_id]
WHERE t.name = N'myTable'
AND s.name = N'dbo'
AND p.index_id IN (0,1);
This will return much, much quicker than COUNT(*), and if your table is changing quickly enough, it's not really any less accurate: if your table changed between when your COUNT started (and locks were taken) and when it returned (when locks were released and all the waiting write transactions were allowed to write to the table), is the exact count that much more valuable? I don't think so.
If you have some subset of the table you want to count (say, WHERE some_column IS NULL), you could create a filtered index on that column, and structure the WHERE clause one way or the other, depending on whether it is the exception or the rule (create the filtered index on the smaller set). So, one of these two indexes:
CREATE INDEX IAmTheException ON dbo.table(some_column)
WHERE some_column IS NULL;
CREATE INDEX IAmTheRule ON dbo.table(some_column)
WHERE some_column IS NOT NULL;
Then you could get the count in a similar way using:
SELECT SUM(p.rows) FROM sys.partitions AS p
INNER JOIN sys.tables AS t
ON p.[object_id] = t.[object_id]
INNER JOIN sys.schemas AS s
ON s.[schema_id] = t.[schema_id]
INNER JOIN sys.indexes AS i
ON p.[object_id] = i.[object_id]
AND p.index_id = i.index_id
WHERE t.name = N'myTable'
AND s.name = N'dbo'
AND i.name = N'IAmTheException'; -- or N'IAmTheRule'
And if you want to know the opposite, you just subtract from the first query above.
(How large is "large amount of data"? - should have commented this first, but maybe the exec below helps you out already)
If I run a query on a static (means no one else is annoying with read/write/updates in quite a while so contention is not an issue) table with 200 million rows and COUNT(*) in 15 seconds on my dev machine (oracle).
Considering the pure amount of data, this is still quite fast (at least to me)
As you said NOLOCK is not an option, you could consider
exec sp_spaceused 'myTable'
as well.
But this boils down to nearly the same as NOLOCK (it ignores contention and in-flight deletes/updates, AFAIK).
I've been working with SSMS for well over a decade and only in the past year found out that it can give you this information quickly and easily, thanks to this answer.
Select the "Tables" folder from the database tree (Object Explorer)
Press F7 or select View > Object Explorer Details to open Object Explorer Details view
In this view you can right-click on the column header to select the columns you want to see, including table space used, index space used and row count.
Note that the support for this in Azure SQL databases seems a bit spotty at best -- my guess is that the queries from SSMS are timing out, so it only returns a handful of tables on each refresh; however, the highlighted one always seems to be returned.
COUNT will do either a table scan or an index scan, so for a high number of rows it will be slow. If you do this operation frequently, the best way is to keep the count in another table, as sketched below.
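A minimal sketch of that side-table approach (all names are illustrative); a trigger keeps the count in step with inserts and deletes:
CREATE TABLE dbo.TableCounts
(
    TableName sysname PRIMARY KEY,
    RowCnt    bigint NOT NULL
);

-- Seed TableCounts once with the current count, then let the trigger maintain it.
CREATE TRIGGER trg_myTable_RowCount ON dbo.myTable
AFTER INSERT, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- Adjust the stored count by however many rows this statement touched.
    UPDATE dbo.TableCounts
       SET RowCnt = RowCnt + (SELECT COUNT(*) FROM inserted)
                           - (SELECT COUNT(*) FROM deleted)
     WHERE TableName = N'myTable';
END;
Reading the count is then a single-row lookup, at the cost of extra work on every write.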
If, however, you do not want to do that, you can create a dummy index (one that will not be used by your queries) and query its number of items, something like:
select
row_count
from sys.dm_db_partition_stats as p
inner join sys.indexes as i
on p.index_id = i.index_id
and p.object_id = i.object_id
where i.name = 'your index'
I am suggesting creating a new index because this one (if it is not used elsewhere) will not get locked during other operations.
As Aaron Bertrand said, maintaining such an index might be more costly than using an already existing one. So the choice is yours.
If you just need a rough count of the number of rows, e.g. to make sure a table loaded properly or that the data was not deleted, do the following (MySQL):
MySQL> connect information_schema;
MySQL> select table_name,table_rows from tables;

Can this SQL Query be optimized to run faster?

I have an SQL Query (For SQL Server 2008 R2) that takes a very long time to complete. I was wondering if there was a better way of doing it?
SELECT @count = COUNT(Name)
FROM Table1 t
WHERE t.Name = @name AND t.Code NOT IN (SELECT Code FROM ExcludedCodes)
Table1 has around 90 million rows in it and is indexed by Name and Code.
ExcludedCodes only has around 30 rows in it.
This query is in a stored procedure and gets called around 40k times; the total time the procedure takes to finish is 27 minutes. I believe this is my biggest bottleneck because of the massive number of rows it queries against and the number of times it does it.
So if you know of a good way to optimize this it would be greatly appreciated! If it cannot be optimized then I guess I'm stuck with 27 min...
EDIT
I changed the NOT IN to NOT EXISTS and it cut the time down to 10:59, so that alone is a massive gain on my part. I am still going to attempt the GROUP BY approach suggested below, but that will require a complete rewrite of the stored procedure and might take some time... (as I said before, I'm not the best at SQL, but it is starting to grow on me. ^^)
In addition to workarounds to get the query itself to respond faster, have you considered maintaining a column in the table that tells whether each row is in this set or not? It requires some maintenance, but if the ExcludedCodes table does not change often, it might be worth it. For example you could add a BIT column:
ALTER TABLE dbo.Table1 ADD IsExcluded BIT;
Make it NOT NULL and default to 0. Then you could create a filtered index:
CREATE INDEX n ON dbo.Table1(name)
WHERE IsExcluded = 0;
Now you just have to update the table once:
UPDATE t
SET IsExcluded = 1
FROM dbo.Table1 AS t
INNER JOIN dbo.ExcludedCodes AS x
ON t.Code = x.Code;
And ongoing you'd have to maintain this with triggers on both tables. With this in place, your query becomes:
SELECT @Count = COUNT(Name)
FROM dbo.Table1 WHERE IsExcluded = 0;
EDIT
As for "NOT IN being slower than LEFT JOIN" here is a simple test I performed on only a few thousand rows:
EDIT 2
I'm not sure why this query wouldn't do what you're after, and be far more efficient than your 40K loop:
SELECT src.Name, COUNT(*)
FROM dbo.Table1 AS src
INNER JOIN #temptable AS t
ON src.Name = t.Name
WHERE src.Code NOT IN (SELECT Code FROM dbo.ExcludedCodes)
GROUP BY src.Name;
Or the LEFT JOIN equivalent:
SELECT src.Name, COUNT(*)
FROM dbo.Table1 AS src
INNER JOIN #temptable AS t
ON src.Name = t.Name
LEFT OUTER JOIN dbo.ExcludedCodes AS x
ON src.Code = x.Code
WHERE x.Code IS NULL
GROUP BY src.Name;
I would put money on either of those queries taking less than 27 minutes. I would even suggest that running both queries sequentially will be far faster than your one query that takes 27 minutes.
Finally, you might consider an indexed view. I don't know your table structure or whether you'd violate any of the restrictions, but it is worth investigating IMHO; a sketch follows.
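A minimal sketch of the indexed-view idea, reusing the IsExcluded column from above (indexed views require SCHEMABINDING and COUNT_BIG(*); the view name is mine):
-- Materialize per-name counts of non-excluded rows; once the clustered
-- index exists, reading a count is a cheap seek instead of a 90M-row scan.
CREATE VIEW dbo.vw_NameCounts
WITH SCHEMABINDING
AS
SELECT Name, COUNT_BIG(*) AS NameCount
FROM dbo.Table1
WHERE IsExcluded = 0
GROUP BY Name;
GO
CREATE UNIQUE CLUSTERED INDEX IX_vw_NameCounts
    ON dbo.vw_NameCounts (Name);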
You say this gets called around 40K times. Why? Is it in a cursor? If so, do you really need a cursor? Couldn't you put the values you want for @name in a temp table, index it, and then join to it?
select t.name, count(t.name)
from table t
join #name n on t.name = n.name
where NOT EXISTS (SELECT Code FROM ExcludedCodes WHERE Code = t.code)
group by t.name
That might get you all your results in one query, and it is almost certainly faster than 40K separate queries. Of course, if you need the count for all the names, it's even simpler:
select t.name, count(t.name)
from table t
where NOT EXISTS (SELECT Code FROM ExcludedCodes WHERE Code = t.code)
group by t.name
NOT EXISTS typically performs better than NOT IN, but you should test it on your system.
SELECT @count = COUNT(Name)
FROM Table1 t
WHERE t.Name = @name AND NOT EXISTS (SELECT 1 FROM ExcludedCodes e WHERE e.Code = t.Code)
Without knowing more about your procedure it's tough to supply concrete optimization suggestions (i.e. code suitable for copy/paste). Does it really need to run 40,000 times? It sounds like your stored procedure needs reworking, if that's feasible. You could exec the above once at the start of the proc, insert the results into a temp table (which can keep the indexes from Table1), and then join on that instead of running this query repeatedly.
This particular bit might not even be the bottleneck that makes your procedure run for 27 minutes. For example, are you using a cursor over those 90 million rows, or scalar-valued UDFs in your WHERE clauses?
Have you thought about doing the query once and populating the data into a table variable or temp table? Something like:
insert into #temp (name, NameCount)
select name, count(name)
from table1
where name not in (select code from ExcludedCodes)
group by name
And don't forget that you could possibly use a filtered index as long as the excluded codes table is somewhat static.
Start by evaluating the execution plan: which is the heaviest part to compute?
Regarding the relation between the two tables, use a JOIN on indexed columns; indexes will optimize query execution. A sketch of a supporting index follows.
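As a sketch, a composite index covering both columns the query touches (the index name is mine; the question only says Name and Code are indexed, possibly separately):
-- With Name and Code in one index, the per-name count plus the
-- NOT IN / NOT EXISTS check can be answered from the index alone,
-- without touching the base table.
CREATE INDEX IX_Table1_Name_Code
    ON dbo.Table1 (Name, Code);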